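<p>Problem 1:</p>
<p><em>Singular/plural:</em>
To get things rolling, I would use the Python package inflect to take singular and plural forms out of the picture. Its <code>compare</code> method treats the singular and plural of a word as a match.</p>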
<p>Problem 2:</p>
<p><em>Split and join:</em>
I wrote a small script to demonstrate how to use it. It has not been rigorously tested, but it should get you going.</p>
<pre><code>import inflect

p = inflect.engine()
lookup_table = ['cats', 'cute kittens', 'dog litter park']
tweets = ['that is a cute cat',
          'kittens are cute',
          'that is a cute kitten',
          'that is a dog litter park',
          'no wonder that dog park is bad']

for tweet in tweets:
    matched = []
    for lt in lookup_table:
        # p.compare() returns a truthy string when two words are equal or
        # differ only in number, and False otherwise.
        match_result = [lt for mt in lt.split() for word in tweet.split() if p.compare(word, mt)]
        if any(match_result):
            matched.append(" ".join(match_result))
    print(tweet, '>>', matched)
</code></pre>
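<p>If you want to reuse the loop above, one possible sketch is to wrap it in a function that takes the word comparison as a parameter. The names <code>find_matches</code>, <code>same_word</code> and <code>naive_same_word</code> are made up for this illustration, and <code>naive_same_word</code> is only a crude stand-in (it drops a single trailing "s") so the sketch runs without inflect installed. Unlike the script above, it records each lookup entry at most once.</p>

```python
from typing import Callable, List


def naive_same_word(a: str, b: str) -> bool:
    """Hypothetical stand-in for inflect's compare(): two words count as
    equal if they match after lowercasing and dropping one trailing 's'.
    Crude on purpose; it mishandles words such as 'bus' or 'mice'."""
    def stem(w: str) -> str:
        w = w.lower()
        return w[:-1] if len(w) > 1 and w.endswith("s") else w
    return stem(a) == stem(b)


def find_matches(tweet: str,
                 lookup_table: List[str],
                 same_word: Callable[[str, str], object] = naive_same_word
                 ) -> List[str]:
    """Return the lookup entries that share at least one word with the
    tweet, each entry listed once, in lookup-table order."""
    tweet_words = tweet.split()
    matched = []
    for entry in lookup_table:
        # An entry matches as soon as any of its words matches any tweet word.
        if any(same_word(word, part)
               for part in entry.split()
               for word in tweet_words):
            matched.append(entry)
    return matched


lookup_table = ['cats', 'cute kittens', 'dog litter park']
for tweet in ['that is a cute cat', 'no wonder that dog park is bad']:
    print(tweet, '>>', find_matches(tweet, lookup_table))
```

<p>With inflect installed, passing <code>inflect.engine().compare</code> as <code>same_word</code> should swap the real singular/plural comparison back in, since it returns a truthy string on a match and <code>False</code> otherwise.</p>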