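<p>The typical solution here is to combine capturing and non-capturing alternatives. Because regex alternation is tried from left to right, put any <em>exceptions</em> first in the pattern (without a capture group) and end with the alternative you actually want to select, wrapped in a capture group.</p>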
<pre><code>import re
text = "Once upon a time, in a time far far away, dogs ruled the world. The End."
query = re.compile(r"""
Once\ upon\ a\ time| # literally 'Once upon a time' (spaces escaped, since re.X ignores bare whitespace),
# should not be selected
time\b # from the word 'time'
(.*) # capture everything
\bend # until the word 'end'
""", re.X | re.I)
result = query.findall(text)
# result = ['', ' far far away, dogs ruled the world. The ']
</code></pre>
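<p>The empty string appears because <code>findall</code> reports the capture group for every match, including matches of the exception branch where the group never took part. As a rough sketch of what is going on, the same alternation can be walked with <code>finditer</code>; the one-line pattern and the name <code>captured</code> are illustrative choices, not part of the original answer.</p>

```python
import re

text = "Once upon a time, in a time far far away, dogs ruled the world. The End."
# One-line form of the alternation: exception first, capturing branch last.
query = re.compile(r"Once upon a time|time\b(.*)\bend", re.I)

for m in query.finditer(text):
    # group(0) is the whole match; group(1) is None when the exception branch matched.
    print(repr(m.group(0)), repr(m.group(1)))

# Keep only the matches where the capturing branch actually participated.
captured = [m.group(1) for m in query.finditer(text) if m.group(1) is not None]
```

<p>Checking for <code>None</code> instead of filtering out empty strings also keeps a legitimately empty capture, should the pattern ever allow one.</p>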
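<p>You can then drop the empty strings (they are produced whenever the unwanted phrase matches, because the capture group is not part of that branch):</p>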
<pre><code>result = list(filter(None, result))
# or result = [r for r in result if r]
# [' far far away, dogs ruled the world. The ']
</code></pre>
<p>Then strip the surrounding whitespace from the results:</p>
<pre><code>result = list(map(str.strip, filter(None, result)))
# or result = [r.strip() for r in result if r]
# ['far far away, dogs ruled the world. The']
</code></pre>
<p>This solution is especially useful when you have many phrases to avoid:</p>
<pre><code>phrases = ["Once upon a time", "No time like the present", "Time to die", "All we have left is time"]
querystring = r"time\b(.*)\bend"
query = re.compile("|".join(map(re.escape, phrases)) + "|" + querystring, re.I)
result = [r.strip() for r in query.findall(some_text) if r]
</code></pre>
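<p>For a version that runs end to end, here is a minimal sketch with a made-up input string. The helper name <code>build_query</code> and the sample value of <code>some_text</code> are assumptions for illustration only, and the helper expects a non-empty list of phrases.</p>

```python
import re

def build_query(phrases, target=r"time\b(.*)\bend"):
    """Join the escaped exception phrases first and the capturing target last."""
    exceptions = "|".join(map(re.escape, phrases))
    return re.compile(exceptions + "|" + target, re.I)

phrases = ["Once upon a time", "No time like the present", "Time to die"]
# Hypothetical sample input: two exception phrases, then one real hit.
some_text = "No time like the present! Time to die? It takes time to find the end."

query = build_query(phrases)
# Exception matches yield '' from findall and are filtered out here.
result = [r.strip() for r in query.findall(some_text) if r]
print(result)
```

<p>Note that "Time to die" and the target branch can both start at the same position; the phrase wins only because it comes first in the alternation, which is why the exceptions have to be listed before the capturing branch.</p>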