<p>You can remove consecutive duplicate words before passing the list to <code>nltk.collocations.BigramCollocationFinder.from_words</code>:</p>
<pre><code>words = 'this this is is a a test test'.split()
# Keep a word only if it differs from the word immediately before it
removed_duplicates = [first for first, second in zip(words, [''] + words) if first != second]
print(removed_duplicates)
# output: ['this', 'is', 'a', 'test']
</code></pre>
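<p>An equivalent way to collapse runs of repeated words is <code>itertools.groupby</code> from the standard library. This is only a sketch of an alternative; the helper name <code>drop_consecutive_duplicates</code> is made up for illustration and is not part of NLTK.</p>

```python
from itertools import groupby


def drop_consecutive_duplicates(words):
    """Collapse each run of identical adjacent words into a single word.

    Hypothetical helper for illustration; not an NLTK function.
    """
    # groupby yields one (key, group) pair per run of equal adjacent items
    return [word for word, _run in groupby(words)]


words = 'this this is is a a test test'.split()
removed_duplicates = drop_consecutive_duplicates(words)
print(removed_duplicates)  # ['this', 'is', 'a', 'test']
```

<p>Like the <code>zip</code> version, this only removes duplicates that are adjacent, so a word that appears again later in the text is kept.</p>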
<p>Then build the finder from the de-duplicated list:</p>
<pre><code>import nltk
b = nltk.collocations.BigramCollocationFinder.from_words(removed_duplicates)
# The keys of ngram_fd are the bigrams (as tuples) found in the word list
b.ngram_fd.keys()
</code></pre>
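<p>To see which bigrams to expect without installing NLTK, adjacent pairs can be counted with <code>collections.Counter</code>. This is a rough stand-in, assuming the finder's default window size of 2; it is not how NLTK is implemented, and the variable name <code>bigram_counts</code> is only illustrative.</p>

```python
from collections import Counter

removed_duplicates = ['this', 'is', 'a', 'test']

# Pair each word with the word that follows it and count the pairs
bigram_counts = Counter(zip(removed_duplicates, removed_duplicates[1:]))

print(list(bigram_counts.keys()))
# [('this', 'is'), ('is', 'a'), ('a', 'test')]
```

<p>With the duplicates removed first, pairs such as <code>('this', 'this')</code> no longer appear among the bigrams.</p>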