awk is a sophisticated file parser that is capable of manipulating data in columns and performing some high-level math operations. A reasonably general representation of a typical awk invocation is as follows:
awk 'BEGIN {commands} /pattern/{commands} END {commands}' file
The BEGIN and END steps are optional. They are used to issue commands before and after the parsing step, usually to first initialize counter variables and then print the result at the end. The pattern is compared against every line in the file, and when a match occurs the associated commands are executed (see sed below for more info on pattern matching). Omitting the pattern is equivalent to matching all lines. The commands can refer to columns in the data file, e.g. print $3 simply prints the third column (columns are delimited by "whitespace", meaning any combination of tabs and spaces, up to the end of the line). You'll notice that the commands have a very C-like look and feel. Even the printf() function is supported, which makes awk a powerful way of cleaning up poorly formatted data files.
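As a minimal sketch of the pieces just described, the following feeds a few lines of made-up data to awk. The names, departments, and numbers are hypothetical and exist only for illustration; the shell variables sample, sales_col, and report are likewise just names chosen here.

```shell
# Hypothetical sample data: name, department, amount (whitespace-separated)
sample='alice  sales  100
bob    eng    250
carol  sales  175.5'

# Pattern plus command: print the third column of lines matching /sales/
sales_col=$(printf '%s\n' "$sample" | awk '/sales/ {print $3}')
echo "$sales_col"

# No pattern, so every line matches; printf lines the columns up
report=$(printf '%s\n' "$sample" | awk '{printf("%-6s %8.2f\n", $1, $3)}')
echo "$report"
```

The second command shows the cleanup use case: the ragged spacing of the input is replaced by fixed-width columns with two decimal places.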
Here are a few examples:
awk 'BEGIN {nl=0} {nl++} END {print nl}' file
Variables are actually automatically initialized to zero, so the nl=0 step is unnecessary, but it's good practice. Also, awk has a special variable NR which contains the current line (record) number in the file, so
awk 'END {print NR}' file
would accomplish the same thing as this example.
awk 'BEGIN {sum=0} {sum += $4} END {printf("%.2f\n",sum/NR)}' file
This prints the average of the fourth column. Here we used printf() to restrict the output to two places after the decimal. Note the \n to force a new line, just like in C.
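To make the averaging example concrete, here is a small sketch run on made-up input rather than a real file. The four-column data and the variable names rows, avg, and empty_avg are assumptions for illustration. It also adds a guard that is not in the original one-liner: on empty input NR is zero, and dividing by it is an error in many awk implementations.

```shell
# Hypothetical four-column data; only the fourth column is averaged
rows='a 1 x 10
b 2 y 20
c 3 z 40'

# Same logic as the one-liner above, reading from a pipe instead of a file
avg=$(printf '%s\n' "$rows" | awk '{sum += $4} END {printf("%.2f\n", sum/NR)}')
echo "$avg"

# Guarded variant: print nothing when there were no input lines
empty_avg=$(printf '' | awk '{sum += $4} END {if (NR > 0) printf("%.2f\n", sum/NR)}')
echo "empty input gives: [$empty_avg]"
```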
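awk '/start/,/stop/{for (i=NF;i>0;i--) printf("%s ",$i); printf("\n")}' file
Here the two patterns separated by a comma form a range: the commands run on every line from one matching start through the next one matching stop, inclusive. For each such line, the loop prints the fields in reverse order, using the built-in variable NF (the number of fields on the current line).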
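A short sketch of the start/stop example on made-up input may help show which lines are selected. The five input lines and the variable names input and reversed are hypothetical. Note that each output line ends with a trailing space, because the loop prints a space after every field.

```shell
# Hypothetical input: only the lines from "start" through "stop" are processed
input='header line
start 1 2 3
a b c
stop x y
trailer line'

# For each line in the range, print its fields from last (NF) to first
reversed=$(printf '%s\n' "$input" |
  awk '/start/,/stop/ {for (i=NF; i>0; i--) printf("%s ", $i); printf("\n")}')
echo "$reversed"
```

The header and trailer lines fall outside the range and so produce no output at all.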
One handy argument to awk is -Fc, where c is any character. This character is used as the delimiter instead of whitespace, which can be useful for, say, extracting the month, day, and year digits from an entry like "MM/DD/YY".
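As a sketch of the date case, the following splits one entry on the / character. The date value and the variable names date_entry and parts are made up for illustration.

```shell
# Hypothetical entry in MM/DD/YY form
date_entry='07/04/99'

# -F/ makes "/" the delimiter, so $1, $2, $3 are month, day, year
parts=$(echo "$date_entry" | awk -F/ '{print "month=" $1, "day=" $2, "year=" $3}')
echo "$parts"
```

The fields keep their leading zeros here because they are only concatenated with strings, never used in arithmetic.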
There's lots more you can do with awk. In fact, there's an entire O'Reilly book on both awk and sed if you're interested...