RegEx: how do I capture only a multi-line text block that repeats throughout a large output?

JOB 1
CASSCF RESULTS
*** Lots of text ***
end
NEVPT2 RESULTS
*** Lots of text ***
end
JOB 2
CASSCF RESULTS
*** Lots of text ***
end
NEVPT2 RESULTS
*** Lots of text ***
end
………………
JOB 31
CASSCF RESULTS
*** Lots of text ***
end
NEVPT2 RESULTS
*** Lots of text ***
end

import re

NEVPT2_Section = r"(?:AILFT MATRIX ELEMENTS \(NEVPT2\)\n-+\n\n)([\s\S]*)(?:\n\n--------------\nCASSCF TIMINGS)"
NEVPT2_Section_mathes = re.finditer(NEVPT2_Section, inp_content, re.MULTILINE)

for xyz in NEVPT2_Section_mathes:
    my_xyz = xyz.group(1)
    print(my_xyz)

2 Answers

User

#1 · Edited 2024-10-01 22:28:37

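As an alternative, you can match the header line with ^NEVPT2.*\n and, with the multiline flag set, keep matching every following line that is not exactly end. The negative lookahead (?!end$) rejects the closing line, so the repetition stops at the first end and each block is matched separately.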

^NEVPT2.*\n(?:(?!end$).*\n)*end$


For example:

NEVPT2_Section = r"^NEVPT2.*\n(?:(?!end$).*\n)*end$"
NEVPT2_Section_mathes = re.finditer(NEVPT2_Section, inp_content, re.MULTILINE)

for xyz in NEVPT2_Section_mathes:
    print(xyz.group())
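
To try the pattern without the real output file, here is a minimal sketch. The `sample_output` string and the `extract_nevpt2_bodies` helper are hypothetical stand-ins that only mirror the block layout from the question, and the added capture group (not part of the answer's pattern) returns just the lines between the header and the closing end.

```python
import re

# Hypothetical stand-in for the real output; only the block layout matters.
sample_output = """\
JOB 1
CASSCF RESULTS
casscf text 1
end
NEVPT2 RESULTS
nevpt2 text 1
more nevpt2 text 1
end
JOB 2
CASSCF RESULTS
casscf text 2
end
NEVPT2 RESULTS
nevpt2 text 2
end
"""

# Same pattern as above, with a capture group around the body lines.
NEVPT2_BLOCK = re.compile(r"^NEVPT2.*\n((?:(?!end$).*\n)*)end$", re.MULTILINE)


def extract_nevpt2_bodies(text):
    """Return the lines between each 'NEVPT2 ...' header and its closing 'end'."""
    return [match.group(1) for match in NEVPT2_BLOCK.finditer(text)]


bodies = extract_nevpt2_bodies(sample_output)
for body in bodies:
    print(body)
```

Because the lookahead is checked at the start of every line, the CASSCF blocks are never entered and each NEVPT2 block ends at its own end line.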

User

#2 · Edited 2024-10-01 22:28:37

You can use

^NEVPT2.+?^end

with both the single-line (dot matches newline) and multiline flags enabled. See a demo on regex101.com.
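
In Python, these two modes correspond to `re.DOTALL` and `re.MULTILINE`. The following is a small sketch of that pattern, assuming a made-up `sample_output` string with the same layout as in the question:

```python
import re

# Hypothetical stand-in for the real output; only the block layout matters.
sample_output = """\
JOB 1
CASSCF RESULTS
casscf text
end
NEVPT2 RESULTS
alpha
end
JOB 2
CASSCF RESULTS
casscf text
end
NEVPT2 RESULTS
beta
gamma
end
"""

# DOTALL lets '.' cross newlines; MULTILINE lets '^' match at every line start.
# The lazy '.+?' stops at the first line that begins with 'end'.
LAZY_BLOCK = re.compile(r"^NEVPT2.+?^end", re.DOTALL | re.MULTILINE)

blocks = [match.group() for match in LAZY_BLOCK.finditer(sample_output)]
for block in blocks:
    print(block)
    print("---")
```

Note that ^end also matches a line that merely starts with end (for example a line beginning with "ending"). If the block text may contain such lines, appending $ to the pattern (^NEVPT2.+?^end$) restricts the match to a line that is exactly end.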
