Gensim LdaMallet division

Posted 2024-06-02 22:03:53

I am trying to reproduce the tutorial on the Mallet wrapper in gensim: http://radimrehurek.com/2014/03/tutorial-on-mallet-in-python/

When I train the model:

model = models.LdaMallet(mallet_path, corpus, num_topics=10, id2word=corpus.dictionary)

I get an error message:

^{pr2}$
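Since the wrapper hands the work to an external Mallet binary, one thing that can be ruled out early is a wrong `mallet_path`. The following is a minimal sketch using only the standard library; `mallet_binary_usable` is a hypothetical helper name, not part of gensim, and the path in the example is a made-up placeholder. Whether the path is related to the error above is an assumption, not something the error message confirms.

```python
import os

def mallet_binary_usable(mallet_path):
    """Return True if mallet_path is an existing file that can be executed.

    Hypothetical helper for illustration; it only inspects the filesystem
    and does not start Mallet.
    """
    return os.path.isfile(mallet_path) and os.access(mallet_path, os.X_OK)

# Placeholder path (an assumption); substitute the value passed to LdaMallet.
example_path = "/path/to/mallet/bin/mallet"
print(mallet_binary_usable(example_path))
```

If this prints False for the path actually in use, the problem lies in the setup rather than in the model itself.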

When I use the model to infer the topic distribution of the example document, the distribution is uniform:

doc = "Don't sell coffee, wheat nor sugar; trade gold, oil and gas instead."
bow = corpus.dictionary.doc2bow(utils.simple_preprocess(doc))
print(model[bow])

My output:

[(0, 0.10000000000000002), (1, 0.10000000000000002), (2, 0.10000000000000002), (3, 0.10000000000000002), (4, 0.10000000000000002), (5, 0.10000000000000002), (6, 0.10000000000000002), (7, 0.10000000000000002), (8, 0.10000000000000002), (9, 0.10000000000000002)]
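To make "uniform" precise: with 10 topics, every probability equals 1/10 up to floating-point noise. The sketch below checks this for a list of (topic_id, probability) pairs in the format printed above; `is_uniform` is a hypothetical helper name and the tolerance is an arbitrary choice.

```python
import math

def is_uniform(distribution, tol=1e-9):
    """Return True if every topic probability is within tol of 1/num_topics.

    `distribution` is a list of (topic_id, probability) pairs.
    An empty list is treated as not uniform.
    """
    if not distribution:
        return False
    expected = 1.0 / len(distribution)
    return all(math.isclose(p, expected, abs_tol=tol) for _, p in distribution)

# The output shown above: ten topics, each with probability 0.10000000000000002.
observed = [(topic_id, 0.10000000000000002) for topic_id in range(10)]
print(is_uniform(observed))
```

A check like this can be run over several test documents to see whether every inference comes back flat or only this one.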

Is this a problem with the wrapper or with Mallet itself? I have successfully reproduced the MALLET tutorial: http://programminghistorian.org/lessons/topic-modeling-and-mallet
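Before attributing the uniform result to either the wrapper or Mallet, it may help to confirm that the query document shares tokens with the training vocabulary. The sketch below is an illustration under stated assumptions: a plain dict `token2id` stands in for `corpus.dictionary.token2id`, the token list is written by hand rather than produced by `utils.simple_preprocess(doc)`, and `split_by_vocabulary` is a hypothetical helper name, not part of gensim.

```python
def split_by_vocabulary(tokens, token2id):
    """Split tokens into those present in the vocabulary and those absent.

    `token2id` is any mapping from token string to integer id.
    Returns (known, unknown), each preserving the input order.
    """
    known = [t for t in tokens if t in token2id]
    unknown = [t for t in tokens if t not in token2id]
    return known, unknown

# Toy vocabulary and tokens, made up for illustration only.
token2id = {"coffee": 0, "wheat": 1, "gold": 2}
tokens = ["don", "sell", "coffee", "wheat", "nor", "sugar"]

known, unknown = split_by_vocabulary(tokens, token2id)
print(known)
print(unknown)
```

If `known` comes back empty for the real dictionary, the bag-of-words passed to the model is empty as well, which would point to the preprocessing step rather than to the wrapper or to Mallet.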

