I downloaded the "words" and "wordnet" corpora from Python's NLTK library:
import nltk
from nltk.corpus import words
from nltk.corpus import wordnet
nltk.download('words')
nltk.download('wordnet')
to check whether the words in my list are English. However, when I run the script, it does not recognize any of the words as English. Here is my script:
samplewords = ['accident scene', 'a%32', 'j & quod', 'accident season',
               'academic discount', 'academic diary', 'academic dictionary']

for word in samplewords:
    if word in words.words():
        print('English', word)
    else:
        print('Not English', word)

for word in samplewords:
    if not wordnet.synsets(word):
        print('Not english', word)
    else:
        print('English', word)
This is the output I get from both approaches:
Not english accident scene
Not english a%32
Not english j & quod
Not english accident season
Not english academic discount
Not english academic diary
Not english academic dictionary
My expected result:
English accident scene
Not english a%32
Not english j & quod
English accident season
English academic discount
English academic diary
English academic dictionary
How can I get the library to recognize those as English words?
words() contains the individual words of the corpus, not collocations. What you need is something that checks whether every word in each phrase is in words.words(). Note, however, that this will also classify non-existent collocations such as 'dictionary season' as English.
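A minimal sketch of that per-word check. In the question's setting the vocabulary would be built from NLTK's words corpus (english_vocab = set(w.lower() for w in words.words())); here a small stand-in set is used so the sketch is self-contained, and the function name is_english_phrase is an illustrative choice, not an NLTK API:

```python
def is_english_phrase(phrase, vocab):
    """A phrase counts as English if every whitespace-separated token is in vocab.

    As noted above, this also accepts non-existent collocations
    such as 'dictionary season'.
    """
    tokens = phrase.split()
    return bool(tokens) and all(t.lower() in vocab for t in tokens)

# Stand-in vocabulary for illustration; with NLTK you would use:
#     english_vocab = set(w.lower() for w in words.words())
# Building a set once makes each lookup O(1); words.words() is a plain list,
# so `word in words.words()` inside a loop is needlessly slow.
english_vocab = {'accident', 'scene', 'season', 'academic',
                 'discount', 'diary', 'dictionary', 'quod'}

samplewords = ['accident scene', 'a%32', 'j & quod', 'accident season',
               'academic discount', 'academic diary', 'academic dictionary']

for phrase in samplewords:
    label = 'English' if is_english_phrase(phrase, english_vocab) else 'Not English'
    print(label, phrase)
```

With the full words corpus as the vocabulary, this prints the expected result above: 'accident scene' and the 'academic …' phrases are English, while 'a%32' and 'j & quod' are not ('a%32' and '&' are not corpus words).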