如何在Python中将长正则表达式规则拆分为多行

萧嘉禧

2023-03-14

问题内容：

这实际上可行吗？我有一些很长的正则表达式模式规则，这些规则很难理解，因为它们无法一次放入屏幕。例：

test = re.compile('(?P<full_path>.+):\d+:\s+warning:\s+Member\s+(?P<member_name>.+)\s+\((?P<member_type>%s)\) of (class|group|namespace)\s+(?P<class_name>.+)\s+is not documented' % (self.__MEMBER_TYPES), re.IGNORECASE)

反斜杠或三重引号将不起作用。

编辑。我结束使用VERBOSE模式。现在是正则表达式模式的外观：

test = re.compile('''
  (?P<full_path>                                  # Capture a group called full_path
    .+                                            #   It consists of one more characters of any type
  )                                               # Group ends                      
  :                                               # A literal colon
  \d+                                             # One or more numbers (line number)
  :                                               # A literal colon
  \s+warning:\s+parameters\sof\smember\s+         # An almost static string
  (?P<member_name>                                # Capture a group called member_name
    [                                             #   
      ^:                                          #   Match anything but a colon (so finding a colon ends group)
    ]+                                            #   Match one or more characters
   )                                              # Group ends
   (                                              # Start an unnamed group 
     ::                                           #   Two literal colons
     (?P<function_name>                           #   Start another group called function_name
       \w+                                        #     It consists on one or more alphanumeric characters
     )                                            #   End group
   )*                                             # This group is entirely optional and does not apply to C
   \s+are\snot\s\(all\)\sdocumented''',           # And line ends with an almost static string
   re.IGNORECASE|re.VERBOSE)                      # Let's not worry about case, because it seems to differ between Doxygen versions

问题答案：

您可以通过引用每个段来分割正则表达式模式。无需反斜杠。

test = re.compile(('(?P<full_path>.+):\d+:\s+warning:\s+Member'
                   '\s+(?P<member_name>.+)\s+\((?P<member_type>%s)\) '
                   'of (class|group|namespace)\s+(?P<class_name>.+)'
                   '\s+is not documented') % (self.__MEMBER_TYPES), re.IGNORECASE)

您还可以使用原始字符串标志，'r'并且必须将其放在每个段之前。

参见文档。

如何在Python中将长正则表达式规则拆分为多行

相关阅读

相关文章

相关问答

相关工具

相关文档