Question:

Python regular expression to match an apostrophe

司马自明

2023-03-14

re_newspeaker =         r'^(<bullet> |  )(?P<name>(%s|(((Mr)|(Ms)|(Mrs))\. [-A-Za-z \']+( of [A-Z][a-z]+)?))|((The ((VICE|ACTING|Acting) )?(PRESIDENT|SPEAKER|CHAIR(MAN)?)( pro tempore)?)|(The PRESIDING OFFICER)|(The CLERK)|(The CHIEF JUSTICE)|(The VICE PRESIDENT)|(Mr\. Counsel [A-Z]+))( \([A-Za-z.\'\- ]+\))?)\.'


re_speaking =           r'^(<bullet> |  )((((((Mr)|(Ms)|(Mrs))\. [A-Za-z \'\-]+(of [A-Z][a-z]+)?)|((The (VICE |Acting |ACTING )?(PRESIDENT|SPEAKER)( pro tempore)?)|(The PRESIDING OFFICER)|(The CLERK))( \([A-Za-z.\'\- ]+\))?))\. )?(?P<start>.)'

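For some reason, the regular expressions above do not capture names that contain an apostrophe.

For example, Mr. D'STALL is not matched. Any help with the regex pattern would be greatly appreciated.

What the code does is take the input and tag it with XML, producing something like the following: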

<speaker>Mr. D'STALL</speaker><speaking>Mr. President, I have been seeking to obtain a report on
this bill. I am not on the Budget Committee, and I am not on the
Government Relations Committee. But from what I understand, this is a
very important bill, a big bill, a complex bill, far reaching in its
contents. I have been queried, along with all other Senators, I
suppose, as to whether or not they would have any objection to the
adoption of the committee amendments, en bloc. I am going to object to
the adoption of the committee amendments, en bloc, until I see the
committee report.</speaking>

  Mr. D'STALL. Mr. President, I have been seeking to obtain a report on
this bill. I am not on the Budget Committee, and I am not on the
Government Relations Committee. But from what I understand, this is a
very important bill, a big bill, a complex bill, far reaching in its
contents. I have been queried, along with all other Senators, I
suppose, as to whether or not they would have any objection to the
adoption of the committee amendments, en bloc. I am going to object to
the adoption of the committee amendments, en bloc, until I see the
committee report.

The regular expressions do not match the paragraph above.
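
One way to narrow a problem like this down is to test the name portion in isolation. The sketch below uses `NAME_RE`, a simplified, hypothetical pattern that covers only the "Mr. NAME." prefix (it is not the full re_newspeaker expression), together with a hypothetical helper `speaker_name`. It shows that a straight ASCII apostrophe inside the character class does accept D'STALL, while the same line written with a typographic apostrophe (U+2019) is rejected. Whether the real input contains U+2019 is an assumption made for illustration; the post does not say.

```python
import re

# Hypothetical, simplified pattern: only the "Mr. NAME." prefix,
# not the full re_newspeaker expression from the question.
NAME_RE = re.compile(r"^(<bullet> |  )(?P<name>(Mr|Ms|Mrs)\. [-A-Za-z ']+)\.")

straight = "  Mr. D'STALL. Mr. President, I have been seeking to obtain a report on"
# Same line, but with a typographic apostrophe (assumed input variant).
curly = straight.replace("'", "\u2019")


def speaker_name(line, pattern=NAME_RE):
    """Return the captured speaker name, or None when the line does not match."""
    m = pattern.match(line)
    return m.group("name") if m else None


print(speaker_name(straight))  # the ASCII apostrophe is in the character class
print(speaker_name(curly))     # U+2019 is not in the class, so no match

# Possible workaround under that assumption: normalise before matching.
print(speaker_name(curly.replace("\u2019", "'")))
```

If the first call already returns the name on the real data, the apostrophe itself is probably not the culprit, and the value substituted for `%s` or the surrounding whitespace would be the next things to check.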

1 answer

杜诚

2023-03-14

re_newspeaker =         r'^(<bullet> |  )(?P<name>(%s|(((Mr)|(Ms)|(Mrs))\. [-A-Z\']+|((Miss) [-A-Z\']+)( of [A-Z][a-z]+)?))|((The ((VICE|ACTING|Acting) )?(PRESIDENT|SPEAKER|CHAIR(MAN)?)( pro tempore)?)|(The PRESIDING OFFICER)|(The CLERK)|(The CHIEF JUSTICE)|(The VICE PRESIDENT)|(Mr\. Counsel [A-Z]+))( \([A-Za-z.\- ]+\))?)\.'

re_speaking =           r'^(<bullet> |  )((((((Mr)|(Ms)|(Mrs))\. [A-Z\']+|((Miss) [-A-Z\']+)(of [A-Z][a-z]+)?)|((The (VICE |Acting |ACTING )?(PRESIDENT|SPEAKER)( pro tempore)?)|(The PRESIDING OFFICER)|(The CLERK))( \([A-Za-z.\- ]+\))?))\. )?(?P<start>.)'

The regular expressions above solved my problem. I thought I would post them in case anyone else runs into the same issue!
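
As a quick check, the re_speaking pattern from this answer can be run against the first line of the sample paragraph from the question. Compared with the question's version, the name character class here accepts only upper-case letters and the apostrophe (plus the hyphen in re_newspeaker), and a Miss alternative has been added. The sketch below is only a sanity test, not part of the original code; the variable names `line`, `m`, and `speech` are illustrative.

```python
import re

# re_speaking copied verbatim from the answer above.
re_speaking = r'^(<bullet> |  )((((((Mr)|(Ms)|(Mrs))\. [A-Z\']+|((Miss) [-A-Z\']+)(of [A-Z][a-z]+)?)|((The (VICE |Acting |ACTING )?(PRESIDENT|SPEAKER)( pro tempore)?)|(The PRESIDING OFFICER)|(The CLERK))( \([A-Za-z.\- ]+\))?))\. )?(?P<start>.)'

line = "  Mr. D'STALL. Mr. President, I have been seeking to obtain a report on"

m = re.match(re_speaking, line)

# The speaker prefix is optional in this pattern, so a bare match proves
# little. What matters is whether the prefix was consumed (group 2) and
# where the speech itself begins (the named group "start").
prefix = m.group(2)
speech = line[m.start("start"):]

print(repr(prefix))
print(speech)
```

One thing to watch: because `|` binds more loosely than concatenation, the optional "of State" suffix in both patterns attaches only to the Miss branch, so a line such as "Mr. NAME of State." would need separate handling.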
