Tabulating a frequency distribution of bigrams

Posted 2024-09-28 01:24:52

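I'm struggling to tabulate a conditional frequency distribution of bigrams in Python 3.7.5. I have about 60 texts, and I can visualize the results successfully with the .plot() method.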

Here is the code that builds the ConditionalFreqDist I use for plotting:

cfd = nltk.ConditionalFreqDist(
    (textname, bigen)
    for textname in eng_corpus.fileids()
    for bigen in nltk.bigrams([w.lower() for w in eng_corpus.words(fileids=textname) if w not in engstops and w.isalnum()]))

However, when I try to tabulate it, I get the following:

>>> cfd.tabulate()
Traceback (most recent call last):
  File "<pyshell#185>", line 1, in <module>
    cfd.tabulate()
  File "C:\Users\gavrk\AppData\Local\Programs\Python\Python37-32\lib\site-packages\nltk\probability.py", line 1979, in tabulate
    width = max(len("%s" % s) for s in samples)
  File "C:\Users\gavrk\AppData\Local\Programs\Python\Python37-32\lib\site-packages\nltk\probability.py", line 1979, in <genexpr>
    width = max(len("%s" % s) for s in samples)
TypeError: not all arguments converted during string formatting

I'm new to Python, so any help would be greatly appreciated.
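
A likely cause, judging only from the traceback: each sample in the distribution is a bigram, which is a 2-tuple, and the line shown from probability.py formats it with "%s" % s. The % operator treats a tuple on the right-hand side as the full list of arguments, so a single %s given two items raises exactly this TypeError. The sketch below reproduces that behavior with plain Python and shows one possible workaround, joining each bigram into a single string before counting. The helper name join_bigrams is hypothetical and not part of NLTK.

```python
# A bigram sample is a 2-tuple, like the pairs nltk.bigrams yields.
bigram = ("frequency", "distribution")

# "%s" % value treats a tuple as the whole argument list, so one "%s"
# combined with a 2-item tuple fails with the error from the traceback.
try:
    "%s" % bigram
    error_message = None
except TypeError as exc:
    error_message = str(exc)


def join_bigrams(words):
    """Return adjacent word pairs as space-joined strings (hypothetical helper)."""
    return [" ".join(pair) for pair in zip(words, words[1:])]


# String samples go through the same width expression without an error.
labels = join_bigrams(["tabulate", "frequency", "distribution"])
width = max(len("%s" % s) for s in labels)

print(error_message)
print(labels, width)
```

Applied to the code in the question, this would mean yielding (textname, " ".join(bigen)) instead of (textname, bigen). That change is untested against the original corpus. With about 60 texts the full table may also be very wide, so passing a small selection through the conditions and samples arguments of tabulate() may make the output easier to read.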

