Gensim: ValueError: failed to create intent(cache|hide)|optional array-- must have defined dimensions but got (0,)

Posted 2024-09-28 22:44:12

I am trying to simulate streaming some documents and to update the LSI model as additional documents come in. I get this error:

Traceback (most recent call last):
  File "gensimStreamGen_tutorial5.py", line 57, in <module>
    for vector in corpus_memory_friendly: # load one vector into memory at a time
  File "gensimStreamGen_tutorial5.py", line 44, in __iter__
    lsi = models.LsiModel(corpus, num_topics=10) # initialize an LSI transformation
  File "/Users/Desktop/gensim-0.12.0/gensim/models/lsimodel.py", line 331, in __init__
    self.add_documents(corpus)
  File "/Users/Desktop/gensim-0.12.0/gensim/models/lsimodel.py", line 388, in add_documents
    update = Projection(self.num_terms, self.num_topics, job, extra_dims=self.extra_samples, power_iters=self.power_iters)
  File "/Users/Desktop/gensim-0.12.0/gensim/models/lsimodel.py", line 126, in __init__
    extra_dims=self.extra_dims)
  File "/Users/Desktop/gensim-0.12.0/gensim/models/lsimodel.py", line 677, in stochastic_svd
    q, _ = matutils.qr_destroy(y) # orthonormalize the range
  File "/Users/Desktop/gensim-0.12.0/gensim/matutils.py", line 398, in qr_destroy
    qr, tau, work, info = geqrf(a, lwork=-1, overwrite_a=True)
ValueError: failed to create intent(cache|hide)|optional array-- must have defined dimensions but got (0,)

Code that streams the documents and updates the LSI model:

^{pr2}$

The corpus gets one new vector on each iteration. The new_vec yielded on each iteration:

[]
[(0, 1)]
[(1, 1), (2, 1), (3, 1)]
[(3, 2), (4, 1), (5, 1)]
[(2, 1), (6, 1), (7, 1)]
[]
[(8, 1)]
[(8, 1), (9, 1)]
[(9, 1), (10, 1), (11, 1)]

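The error occurs on the first iteration (the first line of the new_vec output above, which is an empty vector). The remaining lines are the expected output of new_vec.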
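One plausible reading, offered as an assumption rather than a confirmed diagnosis: on the first iteration the corpus holds only the empty vector `[]`, so no term ids have been seen and the matrix handed to the QR step has a zero-sized dimension, which matches the `(0,)` in the message. The sketch below uses plain Python only, with no gensim calls. The helper names `non_empty_vectors` and `num_terms_seen` are hypothetical and are not part of gensim or of the script in the question.

```python
from typing import Iterable, Iterator, List, Tuple

# (term_id, count) pairs, the same shape as the new_vec output above.
BowVector = List[Tuple[int, int]]


def non_empty_vectors(stream: Iterable[BowVector]) -> Iterator[BowVector]:
    """Yield only the bag-of-words vectors that contain at least one term."""
    for vec in stream:
        if vec:  # an empty list means the document matched no known terms
            yield vec


def num_terms_seen(vectors: Iterable[BowVector]) -> int:
    """Return 1 + the largest term id seen, or 0 if no term appears at all."""
    max_id = -1
    for vec in vectors:
        for term_id, _count in vec:
            max_id = max(max_id, term_id)
    return max_id + 1


# The first three vectors from the output above.
stream = [[], [(0, 1)], [(1, 1), (2, 1), (3, 1)]]

# A corpus holding only the empty first vector exposes zero terms.
print(num_terms_seen(stream[:1]))

# Skipping empty vectors leaves only documents that carry at least one term.
print(list(non_empty_vectors(stream)))
```

In the script from the question, this would amount to building or updating the LSI model only once a non-empty vector has arrived. Where exactly that check belongs is an assumption, since the streaming code itself is not reproduced here.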

