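<p>The idea is to iterate over bigrams (pairs of adjacent words) rather than over single words, so the preceding word is always available as context:</p>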
<pre><code>words = [('there', 'EX'), ('is', 'VBZ'), ('a', 'DT'), ('huge', 'JJ'), ('shaggy', 'NN'), ('dog', 'NN'), ('in', 'IN'), ('the', 'DT'), ('yard', 'NN')]
next(((token1, i)
      for i, ((token1, pos1), (token2, pos2))
      in enumerate(zip(words, words[1:]))
      if pos2 == 'IN'
     ), None)
# => ('dog', 5)
</code></pre>
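<p>If the lookup is needed for more than one tag, the same bigram iteration can be wrapped in a small helper. This is only a sketch: the function name <code>word_before_tag</code> and its parameters are made up for illustration and are not part of any library. It assumes the input is a list of <code>(token, pos)</code> tuples as above.</p>

```python
def word_before_tag(tagged_words, tag):
    """Return (token, index) of the word right before the first word
    tagged `tag`, or None if there is no such word.

    Hypothetical helper; `tagged_words` is a list of (token, pos) tuples.
    """
    # zip(seq, seq[1:]) yields adjacent pairs: (w0, w1), (w1, w2), ...
    pairs = zip(tagged_words, tagged_words[1:])
    return next(
        ((prev_token, i)
         for i, ((prev_token, _prev_pos), (_token, pos)) in enumerate(pairs)
         if pos == tag),
        None,  # default returned when no bigram matches
    )


words = [('there', 'EX'), ('is', 'VBZ'), ('a', 'DT'), ('huge', 'JJ'),
         ('shaggy', 'NN'), ('dog', 'NN'), ('in', 'IN'), ('the', 'DT'),
         ('yard', 'NN')]

print(word_before_tag(words, 'IN'))  # => ('dog', 5)
```

<p>Because only pairs are inspected, a match on the very first word is never reported: it has no preceding word, so the helper returns <code>None</code> in that case.</p>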