NLTK does not recognize valid English words (Python)

Posted 2024-10-01 00:22:44


I downloaded the "words" and "wordnet" corpora for Python's NLTK library:

import nltk
from nltk.corpus import words
from nltk.corpus import wordnet
nltk.download('words')
nltk.download('wordnet')

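The goal is to check whether the words in a list are English.

However, when I run the script, it does not recognize any of the entries as English.

Here is my script: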

samplewords=['accident scene','a%32','j & quod','accident season','academic discount','academic diary','academic dictionary']

for word in samplewords:
    if word in words.words():
        print('English',word)
    else:
        print('Not English',word)

for word in samplewords:
    if not wordnet.synsets(word):
        print('Not english',word)
    else:
        print('English',word)

This is the output I get from both of the loops above:

Not english accident scene
Not english a%32
Not english j & quod
Not english accident season
Not english academic discount
Not english academic diary
Not english academic dictionary

My expected result:

    English accident scene
    Not english a%32
    Not english j & quod
    English accident season
    English academic discount
    English academic diary
    English academic dictionary

How can I make sure the library recognizes these as English words?


1 answer
User
#1 · Posted 2024-10-01 00:22:44

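words.words() contains the individual words in the corpus, not multi-word collocations. An entry such as 'accident scene' is compared as one whole string, so it never matches even though 'accident' and 'scene' are both in the list.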
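To make the point concrete, here is a minimal sketch of how the membership test behaves. The small `vocab` set is a stand-in invented for illustration so the snippet runs without any downloads; with NLTK installed you would presumably build it from `words.words()` instead.

```python
# Stand-in vocabulary (an assumption for illustration only).
# With NLTK available: vocab = set(nltk.corpus.words.words())
vocab = {"accident", "scene", "academic", "diary"}


def in_vocab(entry, vocab):
    """Return True only if the whole string is a single vocabulary entry."""
    return entry in vocab


print(in_vocab("accident", vocab))        # single word: found
print(in_vocab("accident scene", vocab))  # phrase: not one entry, so not found
```

The second lookup fails because the string with the space in it is not itself an element of the vocabulary, which is the same thing that happens with every phrase in samplewords.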

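What you need is something like the following, which checks that every word in the phrase is in words.words(). Note, however, that this will also classify collocations that do not exist, such as 'dictionary season', as English: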

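# Build the vocabulary once; set membership is much faster than scanning the list
english_vocab = set(words.words())

for word in samplewords:
    # A phrase counts as English only if every space-separated token is in the vocabulary
    if all(w in english_vocab for w in word.split()):
        print('English', word)
    else:
        print('Not English', word)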

Result:

English accident scene
Not English a%32
Not English j & quod
English accident season
English academic discount
English academic diary
English academic dictionary
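The caveat about 'dictionary season' can be seen in a small sketch. Everything here is an assumption made for illustration: the `vocab` and `collocations` sets are invented stand-ins, not NLTK data, and `is_known_phrase` is a hypothetical stricter check that only accepts multi-word phrases found in a list of known collocations that you would have to supply yourself.

```python
def all_tokens_known(phrase, vocab):
    """True if every whitespace-separated token of phrase is in vocab."""
    tokens = phrase.split()
    # all() is True for an empty sequence, so reject empty phrases explicitly.
    return bool(tokens) and all(token in vocab for token in tokens)


def is_known_phrase(phrase, vocab, collocations):
    """Hypothetical stricter check: single words must be in vocab,
    multi-word phrases must appear in a set of known collocations."""
    tokens = phrase.split()
    if len(tokens) == 1:
        return tokens[0] in vocab
    return " ".join(tokens) in collocations


# Stand-in data for illustration only, not taken from any NLTK corpus.
vocab = {"accident", "scene", "season", "academic", "diary", "dictionary"}
collocations = {"accident scene", "academic diary"}

for phrase in ["accident scene", "dictionary season", "a%32"]:
    print(phrase, all_tokens_known(phrase, vocab),
          is_known_phrase(phrase, vocab, collocations))
```

The per-token check accepts 'dictionary season' because both tokens are valid words, while the stricter check rejects it. The trade-off is that the stricter version is only as good as the collocation list behind it.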
