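<p>IIUC, the goal is to not match when the preceding word is capitalized. Requiring a non-capitalized word before the match would rule out many legitimate cases.</p>
<p>Here is a regex that allows a few more possibilities (start of the string, or preceded by a non-word character):</p>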
<pre><code>regex = '|'.join(fr'(?:\b[^A-Z]\S*\s+|[^\w\s] ?|^){i}' for i in long_list)
df['count'] = df['text'].str.count(regex)
</code></pre>
<p>For example:</p>
<pre><code>                                           text  count
0           Kevin McDonald has bought a burger.      0
1  The best burger in McDonald is cheeze buger.      1
2                       McDonald's restaurants.      1
3                 Blah. McDonald's restaurants.      1
</code></pre>
<p>You can test the regex and see how it works <a href="https://regex101.com/r/M8nKfw/1" rel="nofollow noreferrer">here</a>.</p>