<p>This should be fun.</p>
<p>If the input is <code>velvet evening purse bags</code> and the desired output is what @MrGeek produced with <code>itertools.combinations</code>, then that output is effectively what <code>skipgrams</code> produces by definition; see <a href="https://tedboy.github.io/nlps/generated/generated/nltk.skipgrams.html" rel="nofollow noreferrer">https://tedboy.github.io/nlps/generated/generated/nltk.skipgrams.html</a>.</p>
<p>So you can get the same result with:</p>
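<pre><code>from nltk import skipgrams, word_tokenize  # word_tokenize must be imported too
s = 'velvet evening purse bags'
tokens = word_tokenize(s)  # needs NLTK's tokenizer data to be downloaded
# n=2 yields pairs; k=len(tokens)-1 lets a pair skip over any number of words in between
list(skipgrams(tokens, n=2, k=len(tokens)-1))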
</code></pre>
<p>[out]:</p>
<pre><code>[('velvet', 'evening'),
('velvet', 'purse'),
('velvet', 'bags'),
('evening', 'purse'),
('evening', 'bags'),
('purse', 'bags')]
</code></pre>
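<p>In this case, <strong>each word can only be paired with a word to its right, which loosely fits how English is read</strong>.</p>
<p>If you instead want all "permutations" of the words in pairs, including each word paired with itself:</p>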
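<p>To make the equivalence concrete, here is a minimal sketch of the same pairing using only the standard library. It assumes <code>str.split()</code> as a stand-in for <code>word_tokenize</code> (an assumption that holds for this simple, punctuation-free input) and uses <code>itertools.combinations</code>, which only ever pairs an item with items that come after it:</p>

```python
from itertools import combinations

s = 'velvet evening purse bags'
# Assumption: whitespace splitting stands in for nltk.word_tokenize here,
# which is only safe because the input has no punctuation.
tokens = s.split()

# combinations(tokens, 2) pairs each token with every token to its right,
# preserving the original left-to-right order inside each pair.
pairs = list(combinations(tokens, 2))

for pair in pairs:
    print(pair)
```

<p>For four tokens this gives 4 * 3 / 2 = 6 pairs, the same six pairs listed in the output above.</p>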
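<pre><code>from itertools import product
from nltk import word_tokenize  # word_tokenize must be imported too
s = 'velvet evening purse bags'
# dict.fromkeys removes duplicate tokens but, unlike set(), keeps their original order
tokens = list(dict.fromkeys(word_tokenize(s)))
list(product(tokens, tokens))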
</code></pre>
<p>[out]:</p>
<pre><code>[('velvet', 'velvet'),
('velvet', 'evening'),
('velvet', 'purse'),
('velvet', 'bags'),
('evening', 'velvet'),
('evening', 'evening'),
('evening', 'purse'),
('evening', 'bags'),
('purse', 'velvet'),
('purse', 'evening'),
('purse', 'purse'),
('purse', 'bags'),
('bags', 'velvet'),
('bags', 'evening'),
('bags', 'purse'),
('bags', 'bags')]
</code></pre>
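<p>If the self-pairs such as <code>('velvet', 'velvet')</code> are not wanted, one possible variation (not part of the approach above, just a sketch) is <code>itertools.permutations</code>, which keeps both orderings of each pair but never pairs an item with itself. As before, <code>str.split()</code> is assumed as a stand-in tokenizer:</p>

```python
from itertools import permutations, product

s = 'velvet evening purse bags'
# Assumption: whitespace splitting stands in for nltk.word_tokenize.
# dict.fromkeys deduplicates while keeping the original token order.
tokens = list(dict.fromkeys(s.split()))

# Every ordered pair, including a word paired with itself: n * n pairs.
all_pairs = list(product(tokens, repeat=2))

# Every ordered pair of two different positions: n * (n - 1) pairs.
ordered_pairs = list(permutations(tokens, 2))

# Filtering the self-pairs out of the product gives the same list.
no_self_pairs = [(a, b) for a, b in all_pairs if a != b]

print(len(all_pairs), len(ordered_pairs))
```

<p>With four distinct tokens, <code>product</code> yields 16 pairs and <code>permutations</code> yields 12, the difference being the four self-pairs.</p>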