Regex - Python: how to find the shortest matching string

Posted 2024-09-30 16:31:55


Suppose we have the following text:

Lorem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industry's standard dummy text ever since the 1500s, when an unknown printer took a galley of type and scrambled it to make a type specimen book. It has survived not only five centuries, but also the leap into electronic typesetting, remaining essentially unchanged. It was popularised in the 1960s with the release of Letraset sheets containing Lorem Ipsum passages, and more recently with desktop publishing software like Aldus PageMaker including versions of Lorem Ipsum.

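I want to match the text between the two bold words ('the' and 'PageMaker').

When I use the.*PageMaker, a large part of the text is matched: the match runs from the first instance of 'the' up to PageMaker, rather than from the instance of 'the' nearest to PageMaker.

Can you help me?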


2 answers

This is a tricky one, but I think a negative lookahead might work:

 the(?!.*the).*PageMaker

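Here we look for a match that starts with "the" and ends with "PageMaker". The negative lookahead (?!.*the) rejects any "the" that is followed by another "the" later in the text, so the match starts at the last "the" and does not itself contain another one.

Check it on regex101.com to see whether this works for you.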
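As a rough sketch of how the lookahead pattern above behaves in Python, the snippet below compares it with the greedy pattern from the question. The shortened text and the variable names are assumptions made for illustration. It also tries a "tempered" variant, the(?:(?!the).)*PageMaker, which is not part of the answer; it is included only because the answer's pattern finds nothing when another "the" appears after "PageMaker".

```python
import re

# Shortened stand-in for the question's paragraph (an assumption for brevity).
text = ("It was popularised in the 1960s with the release of Letraset sheets, "
        "and more recently with software like Aldus PageMaker including "
        "versions of Lorem Ipsum.")

# Greedy baseline from the question: the match starts at the FIRST "the".
greedy = re.search(r"the.*PageMaker", text).group(0)

# The answer's pattern: a "the" is accepted only if no other "the" follows it.
lookahead = re.search(r"the(?!.*the).*PageMaker", text).group(0)

# Hypothetical variation of the input: a "the" also occurs after "PageMaker".
with_trailing_the = text + " That was the end."
lookahead_on_trailing = re.search(r"the(?!.*the).*PageMaker", with_trailing_the)

# Tempered variant (not from the answer): no "the" may occur between the
# starting "the" and "PageMaker", so a later "the" does not matter.
tempered_match = re.search(r"the(?:(?!the).)*PageMaker", with_trailing_the).group(0)

print(greedy)
print(lookahead)
print(lookahead_on_trailing)
print(tempered_match)
```

Both patterns treat "the" as a plain substring, so it would also match inside words such as "other". If that matters for your text, writing \bthe\b restricts the match to the whole word.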

Try including some of the text that comes just before the "the":

import re
txt="Lorem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industry's standard dummy text ever since the 1500s, when an unknown printer took a galley of type and scrambled it to make a type specimen book. It has survived not only five centuries, but also the leap into electronic typesetting, remaining essentially unchanged. It was popularised in the 1960s with the release of Letraset sheets containing Lorem Ipsum passages, and more recently with desktop publishing software like Aldus PageMaker including versions of Lorem Ipsum."

# Anchor on the words just before the wanted "the"; [0] is the whole match.
phrase_get = re.search(r'1960s with the.+PageMaker', txt)[0]
print(phrase_get)
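Note that the anchored search above returns the anchor words ("1960s with ") as part of the result. If only the span from "the" to "PageMaker" is wanted, one possible approach is to wrap that part in a capturing group. This is a minimal sketch; the shortened text and the names match and phrase are assumptions for illustration.

```python
import re

# Shortened stand-in for the question's paragraph (an assumption for brevity).
txt = ("It was popularised in the 1960s with the release of Letraset sheets, "
       "and more recently with software like Aldus PageMaker including versions.")

# The literal prefix only locates the right spot; the parentheses capture
# just the part from "the" through the first "PageMaker" (lazy .+?).
match = re.search(r"1960s with (the.+?PageMaker)", txt)

# re.search returns None when nothing matches, so guard before reading the group.
phrase = match.group(1) if match else None
print(phrase)
```

This still relies on knowing the text in front of the wanted "the", so it suits a fixed document better than arbitrary input.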
