I have a problem using the GoogleNews w2v embeddings.
I downloaded GoogleNews-vectors-negative300.bin.gz
and after running
gensim.models.KeyedVectors.load_word2vec_format('/home/slava/GoogleNews-vectors-negative300.bin.gz', binary=True)
I got this error:
IOerror: not a gzipped file
OK, so I ran gzip GoogleNews-vectors-negative300.bin in the console
and then
file GoogleNews-vectors-negative300.bin.gz
now says that it is actually gzip compressed data.
But running
gensim.models.KeyedVectors.load_word2vec_format('/home/slava/GoogleNews-vectors-negative300.bin.gz', binary=True)
now returns
ValueError: need more than 0 values to unpack
Full traceback:
> ValueError Traceback (most recent call
> last) <ipython-input-9-c4eebc3bcdb0> in <module>()
> 1
> 2 from gensim.models import Word2Vec
> ----> 3 model = gensim.models.KeyedVectors.load_word2vec_format('/home/slava/GoogleNews-vectors-negative300.bin.gz',
> binary=True)
>
> /home/slava/anaconda2/lib/python2.7/site-packages/gensim/models/keyedvectors.pyc
> in load_word2vec_format(cls, fname, fvocab, binary, encoding,
> unicode_errors, limit, datatype)
> 205 with utils.smart_open(fname) as fin:
> 206 header = utils.to_unicode(fin.readline(), encoding=encoding)
> --> 207 vocab_size, vector_size = map(int, header.split()) # throws for invalid file format
> 208 if limit:
> 209 vocab_size = min(vocab_size, limit)
>
> ValueError: need more than 0 values to unpack
How can I fix this?
The file was corrupted; downloading it again solved the problem.
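As a rough way to catch a bad download before handing the file to gensim, here is a minimal sketch. It assumes the file is a gzipped word2vec binary whose first line is a text header of two integers (vocab_size and vector_size), which is what the traceback above tries to parse. The helper name read_word2vec_header is hypothetical and not part of gensim.

```python
import gzip

GZIP_MAGIC = b"\x1f\x8b"  # the first two bytes of every gzip stream


def read_word2vec_header(path):
    """Return (vocab_size, vector_size) read from a gzipped word2vec file.

    Hypothetical helper for illustration; it is not a gensim function.
    Raises ValueError if the file is not gzip or the header is unusable.
    """
    # Step 1: check the gzip magic bytes on the raw file.
    with open(path, "rb") as raw:
        if raw.read(2) != GZIP_MAGIC:
            raise ValueError("not a gzip file: %s" % path)

    # Step 2: decompress only the first line, which should be the header.
    with gzip.open(path, "rb") as fin:
        header = fin.readline().decode("utf-8", errors="replace")

    # Step 3: the header must contain exactly two integers.
    fields = header.split()
    if len(fields) != 2:
        raise ValueError(
            "bad or empty header %r; the download is probably corrupted" % header
        )
    vocab_size, vector_size = map(int, fields)
    return vocab_size, vector_size
```

Calling read_word2vec_header('/home/slava/GoogleNews-vectors-negative300.bin.gz') before load_word2vec_format would raise a readable error on an empty header instead of the unpack failure above. It only inspects the start of the file, so a download truncated further in can still pass; running gzip -t on the file checks the whole stream.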