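<p>I modified your example because <code>word1</code> is technically contained in <code>word11</code> and <code>word12</code>, and I don't think that is what you meant. The bad words below therefore include their surrounding underscores.</p>
<h3>Setup</h3>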
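<p>To illustrate the substring issue, here is a minimal sketch using only the standard <code>re</code> module. The <code>cells</code> list is a hypothetical sample made up for this illustration, not the actual data.</p>

```python
import re

# Hypothetical sample cells, chosen to show the overlap between word1, word11 and word12.
cells = ["aaa_word1_b", "aaa_word11_b", "aaa_word12_b"]

# A bare pattern matches anywhere in the cell, so "word1" also hits "word11" and "word12".
bare = [bool(re.search("word1", c)) for c in cells]

# Including the underscore delimiters restricts the match to the intended token.
delimited = [bool(re.search("_word1_", c)) for c in cells]

print(bare)       # [True, True, True]
print(delimited)  # [True, False, False]
```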
<pre><code>from io import StringIO  # Python 3; the Python 2 `StringIO` module no longer exists
import pandas as pd
text_bad_words = """ words
0 _word1_
1 _word3_
2 _word5_
3 _word13_
4 _word16_"""
text_data = """s.no column1 column2 column3 column4
1 aaa_word1_b aaa_word2_b aaa_word3_b aaa_word4_b
2 aaa_word5_b aaa_word6_b aaa_word7_b aaa_word8_b
3 aaa_word9_b aaa_word10_b aaa_word11_b aaa_word12_b
4 aaa_word13_b aaa_word14_b aaa_word15_b aaa_word16_b
5 aaa_word17_b aaa_word18_b aaa_word19_b aaa_word20_b"""
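# `squeeze=True` and `delim_whitespace=True` are removed or deprecated in recent pandas,
# so use `sep=r'\s+'` and call `.squeeze('columns')` to get the bad words as a Series.
bad_words = pd.read_csv(
    StringIO(text_bad_words), index_col=0, sep=r'\s+').squeeze('columns')
data = pd.read_csv(
    StringIO(text_data), index_col=0, sep=r'\s+')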
</code></pre>
<h3>Solution</h3>
<p>I would build a <code>regex</code> from the bad words and use <code>str.contains</code>.</p>
<pre><code>regex = r'|'.join(bad_words)
regex
'_word1_|_word3_|_word5_|_word13_|_word16_'
</code></pre>
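<p>The joined pattern works as-is here because the bad words contain no regex metacharacters. If they might, one option is to escape each word first with <code>re.escape</code>. A small sketch, where the <code>words</code> list is a hypothetical example and not part of the question:</p>

```python
import re

# Hypothetical bad words; "a.b" and "c+d" contain regex metacharacters.
words = ["_word1_", "a.b", "c+d"]

unescaped = "|".join(words)
escaped = "|".join(re.escape(w) for w in words)

print(bool(re.search(unescaped, "axb")))  # True: "." matches any character
print(bool(re.search(escaped, "axb")))    # False: "." is now a literal dot
print(bool(re.search(escaped, "a.b")))    # True
```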
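<p>Create a boolean mask that is <code>True</code> for every row in which at least one column contains a bad word:</p>
<pre><code>mask = data.stack().str.contains(regex).unstack().any(axis=1)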
mask
s.no
1 True
2 True
3 False
4 True
5 False
dtype: bool
</code></pre>
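<p>If the <code>stack</code>/<code>unstack</code> round trip feels opaque, a column-wise variant should give the same kind of mask. This is only a sketch: the small frame and the shortened pattern below are stand-ins assumed for illustration, not the data from the setup.</p>

```python
import pandas as pd

# Small stand-in for the frame from the setup (an assumption for illustration).
data = pd.DataFrame(
    {
        "column1": ["aaa_word1_b", "aaa_word9_b"],
        "column2": ["aaa_word2_b", "aaa_word10_b"],
    },
    index=pd.Index([1, 3], name="s.no"),
)
regex = "_word1_|_word3_"

# Test each column separately, then flag rows where any column matched.
mask = data.apply(lambda col: col.str.contains(regex)).any(axis=1)

print(mask.tolist())                   # [True, False]
print(data.loc[~mask].index.tolist())  # [3]
```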
<hr/>
<pre><code>data.loc[~mask]
</code></pre>
<p><a href="https://i.stack.imgur.com/lHtVo.png" rel="nofollow noreferrer"><img src="https://i.stack.imgur.com/lHtVo.png" alt="enter image description here"/></a></p>