Python KeyError: '' in automatic language detection

Posted 2024-10-03 17:24:18

I am using stopwords in Python to do automatic language detection.

However, I get a KeyError when I try to test my code. Here is the code:

import nltk
from nltk.corpus import stopwords

def scoreFunction(wholetext):
    dictiolist={}
    scorelist={}
    NLTKlanguages = ["dutch","finnish","german","italian","portuguese","spanish","turkish","danish","english","french","hungarian","norwegian","russian","swedish"]
    FREElanguages = [""]
    languages= NLTKlanguages + FREElanguages
    for lang in NLTKlanguages:
        dictiolist[lang]=stopwords.words(lang)
        tokens=nltk.tokenize.word_tokenize(wholetext)
        tokens=[t.lower() for t in tokens]
        freq_dist=nltk.FreqDist(tokens)
    for lang in languages:
        scorelist[lang]=0
    for word in freq_dist.keys()[0:20]:
        if word in dictiolist[lang]:
            scorelist[lang]+=1
    return scorelist

def whichLanguage(scorelist):
    maximum=0
    for item in scorelist:
        value = scorelist[item]
        if maximum<value:
            maximum = value
            lang = item
    return lang

When I run scoreFunction("hillo my name is osfar, I am a genius") I get the error KeyError: ''. The traceback begins:

Traceback (most recent call last):
  File "", line 1, in

1 answer

Answer 1 · posted 2024-10-03 17:24:18

Your problem is in the following block of code:

for word in freq_dist.keys()[0:20]:
    if word in dictiolist[lang]:
        scorelist[lang]+=1

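You use the variable lang inside this for loop, but nothing in the loop assigns it. Its value is whatever the previous for loop left behind, which happens to be '' (the empty string), the last element of languages. dictiolist has no entry for '', because it was filled only from NLTKlanguages, so the lookup dictiolist[lang] raises KeyError: ''.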
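To see the effect in isolation, here is a minimal sketch with made-up data. The names languages and dictiolist mirror the question, but their contents are hypothetical and much smaller than the NLTK lists. It only shows that a Python for loop leaves its variable bound to the last item once the loop has finished.

```python
# Hypothetical miniature of the question's data, for illustration only.
languages = ["english", "german", ""]  # "" plays the role of FREElanguages
dictiolist = {"english": ["the", "is"], "german": ["der", "ist"]}  # no "" key

for lang in languages:
    pass  # the body does not matter; only the loop variable does

# After the loop, lang is still bound to the last element of languages.
leftover = lang

try:
    dictiolist[leftover]
    error = None
except KeyError as exc:
    # Keep a reference, since exc is unbound when the except block ends.
    error = exc

print(repr(leftover))
print(repr(error))
```

The lookup fails for the same reason as in the question: the leftover key was never added to the dictionary.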

You probably meant to loop over the languages for each word, so that lang is assigned before it is used as a key.
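One way the corrected scoring loop might look is sketched below. This is an assumption about the intended logic, not the original poster's or the answerer's exact code. The function name score_languages and the parameter stopword_lists are hypothetical; with NLTK the mapping could be built from stopwords.words(lang). Iterating over the keys of that mapping means lang is always bound inside the loop and always a valid key.

```python
def score_languages(tokens, stopword_lists):
    """Count, per language, how many tokens are stopwords of that language.

    stopword_lists is a hypothetical mapping {language: list of stopwords}.
    """
    scores = {lang: 0 for lang in stopword_lists}
    for word in tokens:
        # lang is bound by its own loop, and only to keys that exist.
        for lang in stopword_lists:
            if word in stopword_lists[lang]:
                scores[lang] += 1
    return scores


# Made-up stopword lists, far smaller than the real NLTK ones.
demo_lists = {
    "english": ["my", "is", "the", "i", "am"],
    "german": ["ist", "der", "ich"],
}
demo_scores = score_languages("hillo my name is osfar".split(), demo_lists)
print(demo_scores)
```

Unlike the question's code, this sketch scores every token rather than only the 20 most frequent words; that restriction can be applied to tokens before calling the function.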

By the way, there is a simpler way to do what you want: use a Counter. For details, see http://docs.python.org/2.7/library/collections.html#counter-objects.
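As a rough illustration of the Counter suggestion, the sketch below counts word frequencies with collections.Counter and scores each language by how many of the most frequent words are stopwords. The function guess_language, its parameters, and the tiny stopword lists are all hypothetical, and whitespace splitting stands in for nltk.tokenize.word_tokenize so that the example runs without NLTK.

```python
from collections import Counter


def guess_language(text, stopword_lists, top_n=20):
    """Return (best_language_or_None, scores) for the given text.

    stopword_lists is a hypothetical mapping {language: list of stopwords}.
    """
    # Lowercase and split on whitespace; a simple stand-in for a tokenizer.
    tokens = text.lower().split()
    # The top_n most frequent distinct words in the text.
    common_words = [word for word, _ in Counter(tokens).most_common(top_n)]

    scores = Counter()
    for lang, words in stopword_lists.items():
        stopword_set = set(words)  # set membership tests are fast
        scores[lang] = sum(1 for word in common_words if word in stopword_set)

    best = scores.most_common(1)
    if not best or best[0][1] == 0:
        return None, scores  # no language matched any word
    return best[0][0], scores


# Made-up stopword lists for the demo only.
demo_stopwords = {
    "english": ["the", "is", "on", "and", "in"],
    "spanish": ["el", "la", "es", "en", "y"],
}
best, scores = guess_language(
    "the cat is on the mat and the dog is in the garden", demo_stopwords
)
print(best, dict(scores))
```

Returning None when every score is zero avoids picking an arbitrary language for text that matches no stopword list.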