<p>You can remove the stop words during tokenization:</p>
<pre><code>from collections import Counter

stop_words = frozenset(['the', 'a', 'is'])

def mostCommonWords(concordanceList):
    finalCount = Counter()
    for line in concordanceList:
        # drop stop words while tokenizing the line
        words = [w for w in line.split(" ") if w not in stop_words]
        finalCount.update(words)  # update final count using the words list
    return finalCount
</code></pre>
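<p>Note that splitting on a single space keeps case and punctuation, so tokens such as "The" or "is," would not match the stop-word set. If that matters for your data, a minimal sketch of a normalizing variant could look like the following. The function name <code>most_common_words_normalized</code>, the regular expression, and the sample lines are assumptions made for illustration, not part of the original question.</p>

```python
import re
from collections import Counter

# Hypothetical stop-word set, for illustration only.
STOP_WORDS = frozenset(['the', 'a', 'is'])

def most_common_words_normalized(lines):
    """Count words across lines, ignoring case, punctuation and stop words."""
    counts = Counter()
    for line in lines:
        # Lowercase first, then extract runs of letters/apostrophes as tokens.
        tokens = re.findall(r"[a-z']+", line.lower())
        # Filter stop words during tokenization, before counting.
        counts.update(t for t in tokens if t not in STOP_WORDS)
    return counts

sample = ["The cat is a cat.", "A dog is not the cat!"]
print(most_common_words_normalized(sample).most_common(1))  # [('cat', 3)]
```

<p>The regular expression here only covers ASCII letters; text in other scripts would need a different tokenizer.</p>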