<p>You can match whole words from the <code>keepWord</code> list and, in every other context, replace only runs of two or more identical letters:</p>
<pre><code>import re
sentence = 'hello, join this meeting heere using thiis lllink'
keepWord = ['hello','meeting']
new_sentence = re.sub(fr"\b(?:{'|'.join(keepWord)})\b|([^\W\d_])\1+", lambda x: x.group(1) or x.group(), sentence)
print(new_sentence)
# => hello, join this meeting here using this link
</code></pre>
<p>See the <a href="https://ideone.com/Z9Qj1B" rel="nofollow noreferrer">Python demo</a>.</p>
<p>The regex looks like</p>
<pre><code>\b(?:hello|meeting)\b|([^\W\d_])\1+
</code></pre>
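<p>See the <a href="https://regex101.com/r/vsqa8a/1/" rel="nofollow noreferrer">regex demo</a>. If Group 1 matched, its value (a single letter) is returned; otherwise, the whole match (a word to keep) is returned unchanged.</p>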
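<p>The replacement logic can also be spelled out as a named callback instead of a lambda. This is only a sketch of the same idea; <code>collapse_repeats</code> and <code>PATTERN</code> are illustrative names, not part of the original snippet:</p>

```python
import re

# Same pattern as above, with the keep-words written out literally.
PATTERN = re.compile(r"\b(?:hello|meeting)\b|([^\W\d_])\1+")

def collapse_repeats(match):
    # Hypothetical helper name, used here for illustration only.
    # Group 1 is set only when the repeated-letter alternative matched.
    if match.group(1) is not None:
        return match.group(1)  # keep a single copy of the repeated letter
    return match.group()       # a protected word: put it back unchanged

result = PATTERN.sub(collapse_repeats, "hello, join this meeting heere using thiis lllink")
print(result)
```

<p>The explicit <code>is not None</code> check behaves the same as <code>x.group(1) or x.group()</code> here, because Group 1 can never match an empty string.</p>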
<p><strong>Pattern details</strong></p>
<ul>
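<li><code>\b(?:hello|meeting)\b</code> - <code>hello</code> or <code>meeting</code> enclosed in word boundaries</li>
<li><code>|</code> - or</li>
<li><code>([^\W\d_])</code> - Group 1: any Unicode letter</li>
<li><code>\1+</code> - one or more backreferences to the Group 1 value, i.e. the same letter repeated</li>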
</ul>
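<p>The pieces listed above can be assembled into a reusable function. The following is a minimal sketch, assuming the keep-words may contain regex metacharacters (hence <code>re.escape</code>) and that the list may be empty; <code>dedupe_letters</code> is a hypothetical name introduced for this example:</p>

```python
import re

def dedupe_letters(sentence, keep_words):
    # Hypothetical helper, not from the original answer.
    # Second alternative: a letter (word char that is neither digit nor "_")
    # followed by one or more copies of itself.
    repeated = r"([^\W\d_])\1+"
    if keep_words:
        # First alternative: any protected word, escaped and wrapped in \b.
        protected = "|".join(re.escape(word) for word in keep_words)
        pattern = rf"\b(?:{protected})\b|{repeated}"
    else:
        # With nothing to protect, an empty (?:) group would match everywhere,
        # so only the repeated-letter alternative is used.
        pattern = repeated
    return re.sub(pattern, lambda m: m.group(1) or m.group(), sentence)

print(dedupe_letters("hello, join this meeting heere using thiis lllink", ["hello", "meeting"]))
# => hello, join this meeting here using this link
print(dedupe_letters("hello heere", []))
# => helo here
```

<p>Digits and underscores are left alone because <code>[^\W\d_]</code> excludes them, so a string such as <code>aa11__bb</code> becomes <code>a11__b</code>. Note that the <code>\b</code> anchors only behave as expected when each keep-word starts and ends with a word character.</p>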