<pre><code>import re

def compile_words(path):
    # Build one regex that matches any whole word listed in the file (one word per line).
    with open(path) as f:
        # Skip blank lines: an empty word would yield r'\b\b', which matches almost anything.
        words = [line.strip() for line in f if line.strip()]
    patterns = [r'\b%s\b' % re.escape(w) for w in words]
    return re.compile('|'.join(patterns))

col1 = compile_words('col1.txt')
col2 = compile_words('col2.txt')

with open('test1.txt') as infile:
    for line in infile:
        # Keep only lines that contain a word from both lists.
        if col1.search(line) and col2.search(line):
            print(line, end='')
            # Or just write the line to your output file here.
</code></pre>
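<p><strong>Explanation:</strong> First, I take all the words from <code>col1.txt</code> and build a single regular expression out of them; the same is done for <code>col2.txt</code>. Each line of the input file is then searched with both regular expressions, and a line is printed only if both match.</p>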
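<p><code>'\b'</code> is a word boundary. It is useful when you are searching for the word <code>'cat'</code> but a plain pattern would also match inside <code>'catch'</code>; with <code>'\b'</code> on both sides, only the whole word <code>'cat'</code> is found.</p>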
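<p>To see what the word boundary does, here is a minimal sketch that compares a pattern with and without <code>'\b'</code>. The helper name <code>build_word_regex</code> and the sample words and strings are made up for illustration; they are not part of the original files.</p>

```python
import re

def build_word_regex(words):
    # Hypothetical helper: one alternation that matches any listed word as a whole word.
    return re.compile('|'.join(r'\b%s\b' % re.escape(w) for w in words))

with_boundary = build_word_regex(['cat', 'dog'])
without_boundary = re.compile('cat|dog')  # same words, no word boundaries

print(bool(with_boundary.search('catch the ball')))     # False: 'cat' is only part of 'catch'
print(bool(without_boundary.search('catch the ball')))  # True: substring match
print(bool(with_boundary.search('the cat sat')))        # True: 'cat' stands alone
```

<p><code>re.escape</code> matters as well: without it, a word containing a character such as <code>.</code> or <code>+</code> would be interpreted as regex syntax instead of literal text.</p>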