Why don't my Python regex anchors work on a multiline string?

Posted 2024-10-02 06:31:01


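I'm writing a script to clean up text files that were converted from PDF. For some reason, the anchor characters ^ and $ (which match the start and end of a string) don't seem to work in my regular expression. I'm using Python 3.6.6 on Linux.

Why doesn't ^Credits$ match the standalone line Credits in the code below?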

>>> import re
>>> my_regex = r'^Credits$'
>>> my_string = "based upon extrinsic circumstances, as discussed in Serrano v. Priest, 20 Cal.3d 25, 49.\n\nCredits\n(Added by Stats.1977, c. 1197, p. 3979,  1. Amended by Stats.1993, c. 645 (S.B.764),  2.)"
>>> print(re.findall(my_regex,my_string))
[]

Here is the text snippet (my_string) as displayed by print():

based upon extrinsic circumstances, as discussed in Serrano v. Priest, 20 Cal.3d 25, 49.

Credits
(Added by Stats.1977, c. 1197, p. 3979,  1. Amended by Stats.1993, c. 645 (S.B.764),  2.)

Thanks for your help.


1 answer
User
#1 · Posted 2024-10-02 06:31:01

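As @CertainPerformance said, pass the re.M flag as the last argument to findall. Without it, ^ and $ anchor only to the start and end of the whole string; with re.M (re.MULTILINE) they also match at the start and end of each line: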

print(re.findall(my_regex,my_string,re.M))

Demo:

>>> import re
>>> my_regex = r'^Credits$'
>>> my_string = "based upon extrinsic circumstances, as discussed in Serrano v. Priest, 20 Cal.3d 25, 49.\n\nCredits\n(Added by Stats.1977, c. 1197, p. 3979,  1. Amended by Stats.1993, c. 645 (S.B.764),  2.)"
>>> print(re.findall(my_regex,my_string,re.M))
['Credits']
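
To make the effect of the flag easier to see, here is a minimal side-by-side sketch. The shortened `text` value and the variable names are made up for illustration and are not from the original question; only `re.findall` and `re.M` come from the answer above.

```python
import re

# Hypothetical, shortened stand-in for the string in the question
text = "first line\n\nCredits\n(Added by Stats.1977)"

# Default mode: ^ matches only at the very start of the string, and $ only
# at the very end (or just before a final trailing newline)
default_matches = re.findall(r'^Credits$', text)

# re.M (alias of re.MULTILINE): ^ and $ also match at the start and end
# of every line inside the string
multiline_matches = re.findall(r'^Credits$', text, re.M)

print(default_matches)    # []
print(multiline_matches)  # ['Credits']
```

The pattern is identical in both calls; only the flag changes where the anchors are allowed to match.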

Alternatively, put the inline flag in the pattern itself, r'(?m)^Credits$':

>>> import re
>>> my_regex = r'(?m)^Credits$'
>>> my_string = "based upon extrinsic circumstances, as discussed in Serrano v. Priest, 20 Cal.3d 25, 49.\n\nCredits\n(Added by Stats.1977, c. 1197, p. 3979,  1. Amended by Stats.1993, c. 645 (S.B.764),  2.)"
>>> print(re.findall(my_regex,my_string))
['Credits']
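
If the same pattern is applied to many converted files, it may be convenient to compile it once. The sketch below assumes a made-up sample `text` with two standalone Credits lines and uses illustrative variable names; it only shows that the flag argument and the inline (?m) form find the same lines.

```python
import re

# Hypothetical sample with two standalone "Credits" lines
text = "intro\n\nCredits\n(Added by Stats.1977)\nCredits"

# Two equivalent ways to enable multiline anchors
flag_pattern = re.compile(r'^Credits$', re.MULTILINE)
inline_pattern = re.compile(r'(?m)^Credits$')  # inline flag goes at the very start

# Collect the start offset of every match found by each pattern
flag_hits = [m.start() for m in flag_pattern.finditer(text)]
inline_hits = [m.start() for m in inline_pattern.finditer(text)]

print(flag_hits == inline_hits)  # True
print(len(flag_hits))            # 2
```

Which form to use is mostly a matter of taste: the inline flag keeps the behaviour attached to the pattern string, while the flag argument keeps the pattern itself shorter.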
