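<p>Here is how I would do it. I don't think the lookup table needs to be very strict, and we can avoid listing plural forms: a substring match on 'kitten' also catches 'kittens'.</p>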
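<pre><code>import re

lookup_table = ['cat', 'cute kitten', 'dog litter park']
tweets = ['that is a cute cat',
          'kittens are cute',
          'that is a cute kitten',
          'that is a dog litter park',
          'no wonder that dog park is bad']

# Join the tweets with commas so one regex pass can scan all of them;
# [\w\s] never matches ',', so each match stays inside a single tweet.
joined = ','.join(tweets)

for data in lookup_table:
    # Match each individual word of the lookup entry, not the whole phrase.
    for word in data.split(" "):
        # Substring match, so 'kitten' also matches 'kittens'.
        result = re.findall(r'[\w\s]*' + re.escape(word) + r'[\w\s]*', joined)
        if result:
            print(result)
</code></pre>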
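<p>If you also need to know which lookup word matched which tweet, the same idea can be sketched per tweet instead of over one joined string. This is only a rough variant, not a definitive implementation: the function name <code>match_tweets</code> and the dictionary return shape are my own assumptions, and it still uses plain substring matching, so 'cat' would also match a word like 'category'.</p>

```python
import re


def match_tweets(lookup_table, tweets):
    """Map each lookup word to the tweets that contain it as a substring.

    Hypothetical helper for illustration; the name and return shape are assumptions.
    """
    matches = {}
    for entry in lookup_table:
        # Split multi-word entries so each word is matched on its own.
        for word in entry.split():
            pattern = re.compile(re.escape(word))
            hits = [tweet for tweet in tweets if pattern.search(tweet)]
            if hits:
                matches[word] = hits
    return matches


lookup_table = ['cat', 'cute kitten', 'dog litter park']
tweets = ['that is a cute cat',
          'kittens are cute',
          'that is a cute kitten',
          'that is a dog litter park',
          'no wonder that dog park is bad']

for word, hits in match_tweets(lookup_table, tweets).items():
    print(word, hits)
```

<p>Keeping the tweets separate avoids relying on the comma delimiter and makes it easy to see that 'kitten' picks up both 'kittens are cute' and 'that is a cute kitten'.</p>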