re.search puts the tail of one of group(1)'s alternatives at the start of group(2). What is wrong with how my regex is constructed?

Posted 2024-09-23 22:21:49


I am trying to parse paragraphs into data fields based on commonly occurring strings. For example:

import re

tstStr = 'Locations of performance are California, North Carolina and Pennsylvania, with a Sept. 14, 2017, performance completion date.'
pperf = r'([Ww]ork will be performed [(in)(at)]|[Ll]ocation[(s )\s] of performance [(is)(are)])(.*?)( and (the work )?is expected| with a(.*)completion date)'
pTest = re.search(pperf, tstStr)

The expected result is:

pTest.group(2)
California, North Carolina and Pennsylvania,

Instead, I get:

pTest.group(2)
re California, North Carolina and Pennsylvania,

What is wrong with the way I wrote the first group?

Thank you.


1 answer
User
#1 · Posted 2024-09-23 22:21:49

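The key point is that your [(is)(are)] is actually a character class, which matches exactly one character: (, i, s, ), a, r or e. That is why only the a of "are" is consumed by group 1 and the remaining "re" ends up at the start of group 2. You need a non-capturing group, (?:is|are), to match the character sequence "is" or "are". The same applies to [(in)(at)], which should be (?:in|at), and to [(s )\s], which can be written as s? to make the plural optional.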
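To make the difference concrete, here is a minimal sketch comparing the two constructs on the word "are". The variable names char_class and alternation are illustrative only and do not come from the question.

```python
import re

# A character class matches ONE character from the set ( ) i s a r e,
# no matter how the characters are grouped inside the brackets.
char_class = re.compile(r'[(is)(are)]')

# A non-capturing group with alternation matches the whole word "is" or "are".
alternation = re.compile(r'(?:is|are)')

class_hit = char_class.match('are').group(0)    # only the first letter is consumed
group_hit = alternation.match('are').group(0)   # the full word is consumed

print(repr(class_hit))
print(repr(group_hit))
```

With the character class, the rest of the word is left for whatever follows in the pattern, which is how "re" leaked into group 2.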

Use this regex:

([Ww]ork will be performed (?:in|at)|[Ll]ocations? of performance (?:is|are))\s*(.*?)( and (the work )?is expected| with a(.*)completion date)

See the regex demo.
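As a quick check, the corrected pattern can be run against the sample sentence from the question. This is a sketch only: the names tst_str and match are illustrative, and the pattern is split across two raw string literals purely for readability.

```python
import re

tst_str = ('Locations of performance are California, North Carolina and '
           'Pennsylvania, with a Sept. 14, 2017, performance completion date.')

# Same pattern as in the answer, written as adjacent raw strings.
pperf = (r'([Ww]ork will be performed (?:in|at)|[Ll]ocations? of performance (?:is|are))'
         r'\s*(.*?)( and (the work )?is expected| with a(.*)completion date)')

match = re.search(pperf, tst_str)

# group(1) is the full lead-in phrase; group(2) is the list of locations.
print(repr(match.group(1)))
print(repr(match.group(2)))
```

The \s* between the first and second groups absorbs the space after "are", so group(2) starts directly at "California".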
