<p>I would flatten the items into words, skip any stop words, and feed the result into a single <code>Counter</code>:</p>
<pre><code>from collections import Counter
from itertools import chain
lines = [
"this is a concordance string something",
"this is another concordance string blah"
]
stops = {'this', 'that', 'a', 'is'}
words = chain.from_iterable(line.split() for line in lines)
count = Counter(word for word in words if word not in stops)
</code></pre>
<p>Alternatively, the last step can be written as:</p>
^{pr2}$
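<p>The snippet for this alternative is not preserved here, so the following is only a sketch of one possible form, not necessarily the author's. It assumes Python 3 and uses <code>itertools.filterfalse</code> with <code>stops.__contains__</code> as the predicate in place of the generator expression; <code>lines</code> and <code>stops</code> are the same as above.</p>

```python
from collections import Counter
from itertools import chain, filterfalse

lines = [
    "this is a concordance string something",
    "this is another concordance string blah",
]
stops = {'this', 'that', 'a', 'is'}

# Flatten every line into one lazy stream of words.
words = chain.from_iterable(line.split() for line in lines)

# Assumed alternative: filterfalse drops each word for which
# stops.__contains__(word) is True, so only non-stop words are counted.
count = Counter(filterfalse(stops.__contains__, words))
print(count)
```

<p>Both forms walk the words once and produce the same counts; which one reads better is mostly a matter of taste.</p>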