Appendix C. A Sed and Awk Micro-Primer
This is a very brief introduction to the sed and awk text processing utilities. We will deal with only a few basic commands here, but that will suffice for understanding simple sed and awk constructs within shell scripts.
sed: a non-interactive text file editor
awk: a field-oriented pattern processing language with a C-like syntax
For all their differences, the two utilities share a similar invocation syntax, both use regular expressions, both read input by default from stdin, and both output to stdout. These are well-behaved UNIX tools, and they work together well. The output from one can be piped into the other, and their combined capabilities give shell scripts some of the power of Perl.
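As a minimal sketch of the two tools working together in a pipeline: sed strips comment lines, then awk sums the second field of what remains. The sample data and the variable names (sample, total) are invented for illustration.

```shell
# Hypothetical sample data: a name and a score per line, plus one comment line.
sample=$(printf 'alice 10\n# 100 is only a comment\nbob 20\ncarol 30\n')

# sed reads stdin and deletes lines beginning with '#';
# awk reads sed's stdout and adds up the second field of each line.
total=$(printf '%s\n' "$sample" | sed '/^#/d' | awk '{ sum += $2 } END { print sum }')

echo "$total"
```

Without the sed stage, awk would also count the 100 from the comment line, since awk simply splits every line into fields.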
One important difference between the utilities is that while shell scripts can easily pass arguments to sed, it is more complicated for awk (see Example 33-5 and Example 9-23).
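The difference can be sketched as follows. This is only one possible approach, not necessarily the one used in the referenced examples: sed receives the shell variable through double-quoted expansion, while awk receives it through its -v option. The names pattern, text, and pat are hypothetical.

```shell
pattern=BEGIN   # hypothetical shell variable to hand to each tool
text=$(printf 'BEGIN here\nmiddle\nBEGIN again\n')

# sed: double quotes let the shell expand $pattern inside the sed script itself.
sed_out=$(printf '%s\n' "$text" | sed "/^$pattern/d")

# awk: the script stays single-quoted, so the value is passed in separately
# with -v and is seen inside awk as the variable "pat".
# index($0, pat) == 1 is true when the line starts with the value of pat.
awk_out=$(printf '%s\n' "$text" | awk -v pat="$pattern" 'index($0, pat) == 1 { next } { print }')

echo "$sed_out"
echo "$awk_out"
```

Both commands drop the lines that start with BEGIN and leave the line "middle".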
Table C-1. Basic sed operators
Operator | Name | Effect |
---|---|---|
[address-range]/p | print | Print [specified address range] |
[address-range]/d | delete | Delete [specified address range] |
s/pattern1/pattern2/ | substitute | Substitute pattern2 for first instance of pattern1 in a line |
[address-range]/s/pattern1/pattern2/ | substitute | Substitute pattern2 for first instance of pattern1 in a line, over address-range |
[address-range]/y/pattern1/pattern2/ | transform | Replace any character in pattern1 with the corresponding character in pattern2, over address-range (equivalent of tr) |
g | global | Operate on every pattern match within each matched line of input |
Unless the g (global) operator is appended to a substitute command, the substitution operates only on the first instance of a pattern match within each line.
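A short sketch of the difference, using an invented sample line (the variable names are hypothetical):

```shell
line='one fish, two fish, red fish'

# Without g: only the first match on the line is replaced.
first_only=$(printf '%s\n' "$line" | sed 's/fish/cat/')

# With g: every match on the line is replaced.
every_match=$(printf '%s\n' "$line" | sed 's/fish/cat/g')

echo "$first_only"    # one cat, two fish, red fish
echo "$every_match"   # one cat, two cat, red cat
```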
From the command line and in a shell script, a sed operation may require quoting and certain options.
    sed -e '/^$/d' $filename
    # The -e option causes the next string to be interpreted as an editing instruction.
    # (If passing only a single instruction to "sed", the "-e" is optional.)
    # The "strong" quotes ('') protect the RE characters in the instruction
    #+ from reinterpretation as special characters by the body of the script.
    # (This reserves RE expansion of the instruction for sed.)
    #
    # Operates on the text contained in file $filename.
In certain cases, a sed editing command will not work with single quotes.
    filename=file1.txt
    pattern=BEGIN

    sed "/^$pattern/d" "$filename"   # Works as specified.
    # sed '/^$pattern/d' "$filename"   has unexpected results.
    # In this instance, with strong quoting (' ... '),
    #+ "$pattern" will not expand to "BEGIN".
Sed uses the -e option to specify that the following string is an instruction or set of instructions. If there is only a single instruction contained in the string, then this option may be omitted.
    sed -n '/xzy/p' $filename
    # The -n option tells sed to print only those lines matching the pattern.
    # Otherwise all input lines would print.
    # The -e option is not necessary here since there is only a single editing instruction.
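The effect of -n, and of -e with more than one instruction, might be sketched like this. The sample text and variable names are made up for illustration.

```shell
text=$(printf 'alpha\nxzy one\nbeta\nxzy two\n')

# With -n, only the lines selected by p are printed.
with_n=$(printf '%s\n' "$text" | sed -n '/xzy/p')

# Without -n, every line prints once by default, and matching lines
# print a second time because of p.
without_n=$(printf '%s\n' "$text" | sed '/xzy/p')

# Two editing instructions: each one is introduced by its own -e.
combined=$(printf '%s\n' "$text" | sed -e '/alpha/d' -e 's/xzy/XZY/')

printf '%s\n' "$with_n"
printf '%s\n' "$without_n"
printf '%s\n' "$combined"
```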
Table C-2. Examples of sed operators
Notation | Effect |
---|---|
8d | Delete 8th line of input. |
/^$/d | Delete all blank lines. |
1,/^$/d | Delete from beginning of input up to, and including, first blank line. |
/Jones/p | Print only lines containing "Jones" (with -n option). |
s/Windows/Linux/ | Substitute "Linux" for first instance of "Windows" found in each input line. |
s/BSOD/stability/g | Substitute "stability" for every instance of "BSOD" found in each input line. |
s/ *$// | Delete all spaces at the end of every line. |
s/00*/0/g | Compress all consecutive sequences of zeroes into a single zero. |
/GUI/d | Delete all lines containing "GUI". |
s/GUI//g | Delete all instances of "GUI", leaving the remainder of each line intact. |
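A few of the notations from the table can be tried on small inputs. The following is a sketch with invented sample strings and hypothetical variable names:

```shell
# s/00*/0/g : each run of one or more zeroes collapses to a single zero.
squeezed=$(printf '1000 and 2000000\n' | sed 's/00*/0/g')

# s/ *$// : spaces at the end of the line are removed.
trimmed=$(printf 'trailing spaces   \n' | sed 's/ *$//')

# 1,/^$/d : everything from line 1 through the first blank line is deleted.
body=$(printf 'header\n\nbody 1\nbody 2\n' | sed '1,/^$/d')

echo "$squeezed"   # 10 and 20
echo "[$trimmed]"  # [trailing spaces]
printf '%s\n' "$body"
```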
Substituting a zero-length string for another is equivalent to deleting that string within a line of input. This leaves the remainder of the line intact. Applying s/GUI// to the line

    The most important parts of any application are its GUI and sound effects

results in

    The most important parts of any application are its  and sound effects
A backslash forces the sed replacement command to continue on to the next line. This has the effect of using the newline at the end of the first line as the replacement string.
    s/^  */\
    /g
This substitution replaces line-beginning spaces with a newline. The net result is to replace paragraph indents with a blank line between paragraphs.
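A runnable sketch of this substitution on invented sample text follows. It uses two spaces before the * so that the pattern requires at least one leading space; with a single space, the pattern would also match the empty string at the start of unindented lines.

```shell
# Hypothetical input: two paragraphs, each starting with an indented line.
text=$(printf '  First paragraph.\nmore text\n  Second paragraph.\n')

# The backslash at the end of the first script line makes the newline
# itself the replacement string.
result=$(printf '%s\n' "$text" | sed 's/^  */\
/')

printf '%s\n' "$result"
```

Each indented line loses its leading spaces and is preceded by an empty line; the unindented line passes through unchanged.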
An address range followed by one or more operations may require open and closed curly brackets, with appropriate newlines.
    /[0-9A-Za-z]/,/^$/{
    /^$/d
    }
This deletes only the first of each set of consecutive blank lines. That might be useful for single-spacing a text file, but retaining the blank line(s) between paragraphs.
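To see the effect, the bracketed script can be run on a small sample. This is a sketch; the input text and variable names are invented.

```shell
# Hypothetical input: three blank lines after "para one", one after "para two".
text=$(printf 'para one\n\n\n\npara two\n\npara three\n')

# The range runs from a line containing a letter or digit through the next
# blank line; inside that range, the blank line is deleted.
result=$(printf '%s\n' "$text" | sed '/[0-9A-Za-z]/,/^$/{
/^$/d
}')

printf '%s\n' "$result"
```

Of the three blank lines after "para one", two remain; the single blank line after "para two" is removed.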
A quick way to double-space a text file is sed G filename.
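A brief sketch of why this works: G appends a newline plus the contents of the hold space (empty by default) to each line, so every input line is followed by an empty one. The sample input and variable names are hypothetical.

```shell
# Two input lines become four output lines.
doubled=$(printf 'line 1\nline 2\n' | sed G)
line_count=$(printf 'line 1\nline 2\n' | sed G | grep -c '')

printf '%s\n' "$doubled"
echo "$line_count"   # 4
```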
For illustrative examples of sed within shell scripts, see:
- Example 33-1
- Example 33-2
- Example 12-3
- Example A-2
- Example 12-15
- Example 12-24
- Example A-12
- Example A-17
- Example 12-29
- Example 10-9
- Example 12-43
- Example A-1
- Example 12-13
- Example 12-11
- Example A-10
- Example 17-12
- Example 12-16
- Example A-28
For a more extensive treatment of sed, check the appropriate references in the Bibliography.
Notes
[1] | If no address range is specified, the default is all lines. |