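<p>You can use a negative lookahead to assert that the word does not consist solely of runs of digits or runs of letters (word characters other than digits and <code>_</code>), optionally joined by hyphens. Any whitespace-delimited word that does not have that shape is matched.</p>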
<pre><code>(?<!\S)(?!(?:\d+|[^\W\d_]+)(?:-(?:\d+|[^\W\d_]+))*(?!\S))\S+
</code></pre>
<ul>
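<li><code>(?<!\S)</code> Assert a whitespace boundary to the left</li>
<li><code>(?!</code> Negative lookahead, assert that what follows is not
<ul>
<li><code>(?:\d+|[^\W\d_]+)</code> Match either 1+ digits or 1+ word characters excluding digits and <code>_</code> (i.e. letters)</li>
<li><code>(?:</code> Non-capturing group, repeated as a whole
<ul>
<li><code>-(?:\d+|[^\W\d_]+)</code> Match <code>-</code> followed by the same pattern as before</li>
</ul>
</li>
<li><code>)*</code> Close the non-capturing group and optionally repeat it</li>
<li><code>(?!\S)</code> Assert a whitespace boundary to the right</li>
</ul>
</li>
<li><code>)</code> Close the lookahead</li>
<li><code>\S+</code> Match 1+ non-whitespace characters</li>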
</ul>
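<p>The breakdown above can also be written into the pattern itself with <code>re.VERBOSE</code>. This is a minimal sketch of the same regex; the names <code>TOKEN</code> and <code>rejected_tokens</code> are made up for illustration.</p>

```python
import re

# Same pattern as above, spread out with re.VERBOSE so each part is commented.
TOKEN = re.compile(r"""
    (?<!\S)                      # left boundary: start of string or after whitespace
    (?!                          # reject the token if it has the allowed shape:
        (?:\d+|[^\W\d_]+)        #   a run of digits or a run of letters
        (?:-(?:\d+|[^\W\d_]+))*  #   optionally more such runs, each after a hyphen
        (?!\S)                   #   ending at whitespace or end of string
    )
    \S+                          # otherwise consume the whole token
""", re.VERBOSE)


def rejected_tokens(text):
    """Return the whitespace-delimited tokens that the pattern matches."""
    return TOKEN.findall(text)


print(rejected_tokens("covid-19 Tokyo2020 test-t9 double-barreled"))
# ['Tokyo2020', 'test-t9']
```

<p>Because the pattern contains no capturing groups, <code>findall</code> returns the full matches.</p>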
<p><a href="https://regex101.com/r/3DgxUv/1" rel="nofollow noreferrer">Regex demo</a> | <a href="https://ideone.com/ZDUE7z" rel="nofollow noreferrer">Python demo</a></p>
<pre><code>import re
pattern = r"(?<!\S)(?!(?:\d+|[^\W\d_]+)(?:-(?:\d+|[^\W\d_]+))*(?!\S))\S+"
s = ("Tokyo2020\n"
"Tokyo!2020\n"
"covid-19\n"
"cov!d-19\n"
"Oompa-L00mpa\n"
"double-barreled\n"
"double barreled\n"
"test-t9")
result = re.sub(pattern, "[ ]", s)
print(result)
</code></pre>
<p>Output (<code>[ ]</code> is only a visible placeholder here; the replacement could just as well be a space)</p>
<pre><code>[ ]
[ ]
covid-19
[ ]
[ ]
double-barreled
double barreled
[ ]
</code></pre>
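<p>If the lookahead is hard to read, the same result can be obtained without it. This is an alternative sketch, not the pattern from the answer: it visits every run of non-whitespace and keeps it only when it fully matches the allowed shape. The names <code>ALLOWED</code> and <code>mask_disallowed</code> are hypothetical.</p>

```python
import re

# Positive description of an allowed token: runs of digits or runs of letters,
# optionally joined by single hyphens.
ALLOWED = re.compile(r"(?:\d+|[^\W\d_]+)(?:-(?:\d+|[^\W\d_]+))*")


def mask_disallowed(text, replacement="[ ]"):
    """Replace every whitespace-delimited token that is not of the allowed shape."""
    def swap(match):
        token = match.group()
        # Keep the token only if the allowed pattern covers all of it.
        return token if ALLOWED.fullmatch(token) else replacement

    # \S+ matches each maximal run of non-whitespace, so the original
    # whitespace (including newlines) is left untouched.
    return re.sub(r"\S+", swap, text)


print(mask_disallowed("covid-19 cov!d-19 test-t9"))
# covid-19 [ ] [ ]
```

<p>The trade-off is one callback per token instead of a single pattern, in exchange for a regex that only has to describe what is allowed.</p>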