<p><a href="http://radimrehurek.com/gensim/models/word2vec.html#gensim.models.word2vec.Word2Vec" rel="noreferrer">A list of <code>utf-8</code> sentences</a>. You can also stream the data from disk.</p>
<p>Make sure the text is <code>utf-8</code>, then split each sentence into tokens:</p>
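<pre><code>from gensim.models import word2vec

sentences = ["the quick brown fox jumps over the lazy dogs",
             "Then a cop quizzed Mick Jagger's ex-wives briefly."]
# min_count=1 keeps every word in this tiny sample; the default of 5 would discard them all
model = word2vec.Word2Vec([s.encode('utf-8').split() for s in sentences], size=100, window=5, min_count=1, workers=4)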
</code></pre>
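<p>For the streaming case, a minimal sketch might look like the following. It assumes a plain-text corpus with one sentence per line; the class name <code>LineSentences</code> and the file name in the usage note are hypothetical, not part of gensim.</p>

```python
import io


class LineSentences:
    """Hypothetical helper: yields one tokenized sentence per line of a UTF-8 text file."""

    def __init__(self, path):
        self.path = path

    def __iter__(self):
        # Re-open the file on every pass so the corpus can be iterated more than once.
        with io.open(self.path, encoding="utf-8") as handle:
            for line in handle:
                tokens = line.split()
                if tokens:  # skip blank lines
                    yield tokens
```

<p>An instance could then be passed in place of the in-memory list, for example <code>word2vec.Word2Vec(LineSentences("corpus.txt"), ...)</code>. A restartable iterable is used rather than a generator because the model reads the corpus more than once.</p>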