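I'm struggling to tabulate a conditional frequency distribution of bigrams in Python 3.7.5. I have about 60 texts, and I can successfully visualize the results with the .plot command.
Here is the code for the ConditionalFreqDist that I use for plotting: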
cfd = nltk.ConditionalFreqDist(
    (textname, bigen)
    for textname in eng_corpus.fileids()
    for bigen in nltk.bigrams(
        [w.lower() for w in eng_corpus.words(fileids=textname)
         if w not in engstops and w.isalnum()]))
However, when I try to tabulate, I get the following:
>>> cfd.tabulate()
Traceback (most recent call last):
  File "<pyshell#185>", line 1, in <module>
    cfd.tabulate()
  File "C:\Users\gavrk\AppData\Local\Programs\Python\Python37-32\lib\site-packages\nltk\probability.py", line 1979, in tabulate
    width = max(len("%s" % s) for s in samples)
  File "C:\Users\gavrk\AppData\Local\Programs\Python\Python37-32\lib\site-packages\nltk\probability.py", line 1979, in <genexpr>
    width = max(len("%s" % s) for s in samples)
TypeError: not all arguments converted during string formatting
I'm new to Python, so any help would be greatly appreciated.
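
A possible explanation, judging only from the traceback: each sample in the distribution is a bigram, i.e. a 2-tuple, and the line `"%s" % s` treats a tuple on the right-hand side as a list of arguments, so two values are offered to a single `%s` placeholder. The sketch below reproduces that behaviour without NLTK and shows one workaround, turning each bigram into a single string before it is counted. The helper name `bigram_strings` and the sample word list are made up for illustration and are not part of the original code.

```python
# Reproduce the formatting failure without NLTK: a 2-tuple on the right of %
# is unpacked as two arguments for a single %s placeholder.
bigram = ("conditional", "frequency")

try:
    "%s" % bigram
except TypeError as err:
    message = str(err)  # the same error text as in the traceback


def bigram_strings(words):
    """Yield each adjacent word pair as one space-joined string.

    Hypothetical helper: it stands in for nltk.bigrams plus a join step.
    """
    for first, second in zip(words, words[1:]):
        yield first + " " + second


# Illustrative input only; in the question this would be the filtered,
# lowercased word list of one text.
words = ["the", "quick", "brown", "fox"]
samples = list(bigram_strings(words))

# Each sample is now a plain string, so "%s" % sample formats cleanly.
widths = [len("%s" % s) for s in samples]
```

In the generator from the question, the equivalent change would presumably be to emit `(textname, " ".join(bigen))` instead of `(textname, bigen)`. Since a table over every distinct bigram in 60 texts would be very wide, it may also help to pass a short list to the `samples=` argument of `tabulate` (and, if needed, `conditions=`) so that only a handful of columns are printed.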