Multithreading with the NLTK WordNetLemmatizer?

Posted 2024-06-18 03:42:38


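I am trying to use multithreading to speed up processing. I use WordNetLemmatizer to lemmatize words, which sentiwordnet then uses to compute the sentiment of the text. My sentiment analysis function, which uses WordNetLemmatizer, looks like this: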

import nltk
from nltk.corpus import sentiwordnet as swn

def SentimentA(doc, file_path):
    sentences = nltk.sent_tokenize(doc)
    # print(sentences)
    stokens = [nltk.word_tokenize(sent) for sent in sentences]
    taggedlist = []
    for stoken in stokens:
        taggedlist.append(nltk.pos_tag(stoken))
    wnl = nltk.WordNetLemmatizer()
    score_list = []
    for idx, taggedsent in enumerate(taggedlist):
        score_list.append([])
        for idx2, t in enumerate(taggedsent):
            newtag = ''
            lemmatized = wnl.lemmatize(t[0])
            if t[1].startswith('NN'):
                newtag = 'n'
            elif t[1].startswith('JJ'):
                newtag = 'a'
            elif t[1].startswith('V'):
                newtag = 'v'
            elif t[1].startswith('R'):
                newtag = 'r'
            else:
                newtag = ''
            if (newtag != ''):
                synsets = list(swn.senti_synsets(lemmatized, newtag))

                score = 0
                if (len(synsets) > 0):
                    for syn in synsets:
                        score += syn.pos_score() - syn.neg_score()
                    score_list[idx].append(score / len(synsets))
    return SentiCal(score_list)

When I run 4 threads, the first 3 fail with the following error, while the last one works fine.

[traceback missing]

I have already tried importing the NLTK package locally, as suggested in this NLTK issue, and I have also tried the solution given on this page.


1 answer
User
#1 · Posted 2024-06-18 03:42:38

Quick hack:

import nltk
from nltk.corpus import sentiwordnet as swn
# Do this first, before the threads start: pulling one item from the
# corpus forces the LazyCorpusLoader to "materialize" (actually load).
next(swn.all_senti_synsets()) 

# Your other code here. 
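
To show where the hack fits relative to the threads, here is a minimal sketch using only the standard library. The helper name `run_after_preload` and its parameters are hypothetical, not part of NLTK; the idea is simply to call the preload step once in the main thread and only then hand work to a thread pool.

```python
from concurrent.futures import ThreadPoolExecutor


def run_after_preload(preload, worker, items, max_workers=4):
    """Run preload() once in the calling thread, then map worker over items in threads.

    `preload`, `worker` and this helper are illustrative names (assumptions),
    not NLTK APIs.
    """
    # Force any lazy loading to finish before worker threads exist, so the
    # threads never race to initialise the same lazily loaded object.
    preload()
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map returns results in the same order as `items`.
        return list(pool.map(worker, items))
```

With the code from the question, `preload` could be `lambda: next(swn.all_senti_synsets())` and `worker` a one-argument wrapper around `SentimentA`; that pairing is an assumption based on the hack above, not something tested here.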

More details later... still typing.
