18 Regular Expressions - 18.1 A Brief Introduction to Regular Expressions




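A regular expression contains one or more of the following:

  • A character set. These are characters that keep their literal meaning. The simplest type of regular expression consists only of a character set.
  • An anchor. Anchors designate the position in a line of text that the regular expression is to match. For example, ^ and $ are anchors.
  • Modifiers. Modifiers expand or narrow (modify) the range of text that the regular expression is to match. Modifiers include the asterisk, brackets, and the backslash.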
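
The three building blocks above can be tried out with grep. The following is a minimal sketch, not part of the original text; the sample lines and the variable names (sample, literal, anchored, modified) are made up for illustration.

```shell
#!/bin/bash
# Hypothetical sample text, used only for illustration.
sample=$(printf '%s\n' 'error: disk full' 'no error here' 'errors: 2')

# Character set only: the literal string "error" anywhere in a line.
literal=$(printf '%s\n' "$sample" | grep -c 'error')

# Anchor: "error" only at the beginning of a line.
anchored=$(printf '%s\n' "$sample" | grep -c '^error')

# Modifier: brackets narrow the match to "error" followed by ":" or "s".
modified=$(printf '%s\n' "$sample" | grep -c 'error[:s]')

echo "literal=$literal anchored=$anchored modified=$modified"
# Prints: literal=3 anchored=2 modified=2
```

grep -c prints the number of matching lines, which makes it easy to see how each added element narrows the match.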

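The main uses of regular expressions (REs) are text searches and string manipulation. An RE matches a single character or a set of characters, that is, a string or a part of a string.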

  • The asterisk * matches any number of repetitions of the preceding subexpression, including zero.


  • The dot . matches any single character except a newline. [2]
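
As a small illustration of the dot, the sketch below filters a few made-up strings with grep. The input values and the variable names (words, dot_matches) are assumptions chosen for this example.

```shell
#!/bin/bash
# Hypothetical input, one candidate string per line.
words=$(printf '%s\n' '13' '113' '1a3' '1 3')

# "1.3" requires exactly one character (any character, including a space)
# between the 1 and the 3, so "13" alone does not match.
dot_matches=$(printf '%s\n' "$words" | grep '1.3')

printf '%s\n' "$dot_matches"
# Prints:
# 113
# 1a3
# 1 3
```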


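  • The caret ^ matches the beginning of a line, but sometimes, depending on context, it negates the meaning of a set of characters instead (translator's note: for example, [^a] matches any single character other than a).
  • The dollar sign $ matches the end of a line.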
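
The two anchors can be compared side by side with grep -c. This is only a sketch; the sample lines and variable names (lines, at_start, at_end, whole_line) are hypothetical.

```shell
#!/bin/bash
# Hypothetical sample lines.
lines=$(printf '%s\n' 'end of story' 'the very end' 'the end is near' 'end')

# ^end : "end" at the beginning of a line.
at_start=$(printf '%s\n' "$lines" | grep -c '^end')

# end$ : "end" at the end of a line.
at_end=$(printf '%s\n' "$lines" | grep -c 'end$')

# ^end$ : the line consists of "end" and nothing else.
whole_line=$(printf '%s\n' "$lines" | grep -c '^end$')

echo "at_start=$at_start at_end=$at_end whole_line=$whole_line"
# Prints: at_start=2 at_end=2 whole_line=1
```

The line "the end is near" contains "end" but is counted by none of the three patterns, because the match is tied to a line boundary in each case.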


  • Brackets [...] match any single one of the enclosed characters.
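
A short sketch of bracket expressions follows, including a range and the negated form with a leading caret. The word list and the variable names (names, in_set, in_range, negated) are invented for illustration.

```shell
#!/bin/bash
# Hypothetical word list.
names=$(printf '%s\n' 'cat' 'cot' 'cut' 'c9t' 'ct')

# c[aou]t : one of a, o, or u between c and t.
in_set=$(printf '%s\n' "$names" | grep -c 'c[aou]t')

# c[0-9]t : a range, here any single digit.
in_range=$(printf '%s\n' "$names" | grep -c 'c[0-9]t')

# c[^a]t : a leading caret negates the set, so any single character except a.
negated=$(printf '%s\n' "$names" | grep -c 'c[^a]t')

echo "in_set=$in_set in_range=$in_range negated=$negated"
# Prints: in_set=3 in_range=1 negated=3
```

Note that "ct" never matches: a bracket expression always stands for exactly one character, so something must sit between the c and the t.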


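  • The backslash \ escapes a special character, which means that the character is interpreted literally (and therefore no longer has a special meaning).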
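
To see the effect of escaping, the sketch below runs the same search with and without a backslash. The input lines and variable names (text, any_char, literal_dot, ends_in_5, literal_dollar) are assumptions made for this example.

```shell
#!/bin/bash
# Hypothetical input lines.
text=$(printf '%s\n' 'cost: $5' 'cost: 5' 'a.b' 'axb')

# Unescaped dot: any character between a and b.
any_char=$(printf '%s\n' "$text" | grep -c 'a.b')

# Escaped dot: only a literal period between a and b.
literal_dot=$(printf '%s\n' "$text" | grep -c 'a\.b')

# Unescaped $ at the end of the RE: an anchor, so "5 at end of line".
ends_in_5=$(printf '%s\n' "$text" | grep -c '5$')

# Escaped $: a literal dollar sign followed by 5.
literal_dollar=$(printf '%s\n' "$text" | grep -c '\$5')

echo "any_char=$any_char literal_dot=$literal_dot"
# Prints: any_char=2 literal_dot=1
echo "ends_in_5=$ends_in_5 literal_dollar=$literal_dollar"
# Prints: ends_in_5=2 literal_dollar=1
```

The REs are single-quoted so that the shell passes the backslash and the dollar sign to grep unchanged.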


  • Escaped angle brackets \<...\> mark word boundaries.

    "\<the\>" matches the word "the," but not the words "them," "there," "other," and so on.

    bash$ cat textfile
    This is line 1, of which there is only one instance.
    This is the only instance of line 2.
    This is line 3, another line.
    This is line 4.

    bash$ grep 'the' textfile
    This is line 1, of which there is only one instance.
    This is the only instance of line 2.
    This is line 3, another line.

    bash$ grep '\<the\>' textfile
    This is the only instance of line 2.


    TEST FILE: tstfile                          # No match.
                                                # No match.
    Run grep "1133*" on this file.              # Match.
                                                # No match.
                                                # No match.
    This line contains the number 113.          # Match.
    This line contains the number 13.           # No match.
    This line contains the number 133.          # No match.
    This line contains the number 1133.         # Match.
    This line contains the number 113312.       # Match.
    This line contains the number 1112.         # No match.
    This line contains the number 113312312.    # Match.
    This line contains no numbers at all.       # No match.

    bash$ grep "1133*" tstfile
    Run grep "1133*" on this file.              # Match.
    This line contains the number 113.          # Match.
    This line contains the number 1133.         # Match.
    This line contains the number 113312.       # Match.
    This line contains the number 113312312.    # Match.


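[1] A meta-meaning is the meaning of a term or expression on a higher level of abstraction. For example, the literal meaning of "regular expression" is an ordinary expression that conforms to accepted usage. The meta-meaning is entirely different, as discussed in this chapter.
[2] Since sed, awk, and grep process single lines, there will usually not be a newline to match. In those cases where there is a newline in a multiple-line expression, the dot will match the newline.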

    #!/bin/bash

    # N appends the next input line to the pattern space,
    # so .* spans the embedded newline.
    sed -e 'N;s/.*/[&]/' << EOF   # Here Document
    line1
    line2
    EOF
    # OUTPUT:
    # [line1
    # line2]

    echo

    # $0 is rebuilt with a newline between the two fields;
    # the dot in /line.1/ then matches that newline.
    awk '{ $0=$1 "\n" $2; if (/line.1/) {print}}' << EOF
    line 1
    line 2
    EOF
    # OUTPUT:
    # line
    # 1

    # Thanks, S.C.

    exit 0