Command Line: process a file line by line

27 May, 2018

Introduction

We can use a Perl one-liner in the terminal to process files line by line. It lets us do many useful things:

  • Modify a file
  • Count lines that match a regular expression
  • Add a line before or after the matched line
  • Remove or keep only the lines that match our search
  • and more

Using Perl to Process Files

Let's go through a few examples. I created a file with a list of animals, one per line, and named it animals.txt:

elephant
lion
bear

First, let's replace lion with Lion King in the content of this file:

perl -p -e '$_ = qq(Lion King\n) if /lion/' animals.txt

This command says the following:

Look for "lion" on each line of animals.txt and replace the whole matching line with "Lion King".

The -e option executes the Perl code you give it, while -p feeds each line of the input file to this code and then prints it. The line itself is available in the $_ variable and the line number in the $. variable. Once your code finishes processing a line, Perl prints whatever is in $_.

So far, the command only prints the result to the screen; it doesn't edit the file. However, there is an option to edit the file in place: -i[extension]. Perl renames the original file and writes all changes to a new file with the original name. If you give an extension (e.g., -i.bak), Perl keeps the original as a backup by appending this extension to the file name. If you omit the extension, no backup is kept: Perl deletes the original file on exit (if your system allows this). Therefore, to edit the file in place without a backup, we'd add -i to the original command:

perl -p -i -e '$_ = qq(Lion King\n) if /lion/' animals.txt

To back up the original file, we'd add an extension to the -i option, like this:

perl -p -i.bak -e '$_ = qq(Lion King\n) if /lion/' animals.txt

To add a line after the matched one:

perl -p -e '$_ .= qq(new line\n) if /lion/' animals.txt

To add a line before the matched one:

perl -p -e '$_ = qq(new line\n$_) if /lion/' animals.txt

To print only the lines that matched, we can use the -n option instead of -p:

perl -n -e 'print $_ if /lion/' animals.txt

The -n option acts similarly to -p: it lets you process the file line by line, but it doesn't print the lines automatically. You print the lines you want with the print function.

To modify every line, we drop the if condition. The following command prefixes each line with its line number:

perl -p -e '$_ = qq($. $_)' animals.txt

We could use the BEGIN and END blocks to count the lines that match a search. Let's count the lines that include either lion or bear:

perl -n -e 'BEGIN {$count = 0} END {print $count . "\n"} $count += 1 if /lion|bear/' animals.txt

In the BEGIN block, we initialise the $count variable, and in the END block, we print it. For each line, we add 1 to the count if it includes either lion or bear.

Conclusion

This is only a small sample of what Perl can do on the command line. For more, see:

  1. perlrun - how to execute the Perl interpreter
  2. Perl Command-Line Options