How to count pos/neg/neu scored sentences

Posted 2024-10-06 16:23:32


I have the following code:

def sent_dictionary__sentence_(text):
    token_dictionary = {}
    neg_count = []
    pos_count = []
    neu_count = []
    lexicon_keys = lexicon_dictionary.keys()
    sentences = sent_tokenize(text)
    for sentence in sentences:
        tokens = apply_lemmatization(remove_punctuation(word_tokenize(str(sentence))))
        for token in tokens:
            if token in lexicon_keys:
                token_dictionary[token] = token_dictionary.get(token, 0) + lexicon_dictionary[token]
                new_token_dict = token_dictionary
                txt_values = list(new_token_dict.values())
                if sum(txt_values) == 0:
                    neu_count.append(sentence)
                if sum(txt_values) > 0:
                    pos_count.append(sentence)
                if sum(txt_values) < 0:
                    neg_count.append(sentence)
    print(len(neg_count))
    print(len(neu_count))
    print(len(pos_count))

What I want to do:

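I have a text document that I want to split into sentences and then into words, so that I can check whether each word in a sentence is in the SentiWordNet dictionary. If a word is in the dictionary, I associate its score with the word, and if the same word occurs more than once, I add the scores together.

Then, based on that, I want to add up the scores of the words in each sentence and print how many sentences are positive, how many are negative, and how many are neutral.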
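The per-sentence scoring described above could be sketched roughly as follows. This is only an illustration under assumptions: `TOY_LEXICON` is a hypothetical stand-in for `lexicon_dictionary`, and `simple_tokens` is a hypothetical regex tokenizer used in place of `word_tokenize`, `remove_punctuation`, and `apply_lemmatization` so that the snippet runs without NLTK.

```python
import re

# Hypothetical toy lexicon standing in for the SentiWordNet-derived
# lexicon_dictionary from the question (word -> score).
TOY_LEXICON = {"good": 1.0, "great": 2.0, "bad": -1.0, "awful": -2.0}


def simple_tokens(sentence):
    """Lowercase a sentence and pull out word-like tokens (no lemmatization)."""
    return re.findall(r"[a-z']+", sentence.lower())


def sentence_score(sentence, lexicon):
    """Sum the lexicon scores of every token in one sentence.

    A word that occurs twice contributes its score twice; words that are
    not in the lexicon contribute nothing.
    """
    return sum(lexicon.get(token, 0.0) for token in simple_tokens(sentence))


# "good" appears twice (+2.0) and "bad" once (-1.0), so the total is 1.0.
print(sentence_score("Good, good food but bad service.", TOY_LEXICON))
```

The point of the sketch is that the score belongs to a single sentence: it is computed from that sentence's tokens only, so a sentence with no lexicon words scores 0 regardless of what came before it.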

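When I run it on a text document with 152 sentences, I get 1473 positive, 31 negative, and 1203 neutral, which is clearly wrong. What should I change to make this code actually work?

Thanks in advance.

Edit:

I think I fixed it. If anyone is curious, here is the updated code: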

def sent_dictionary__sentence_(text):
    token_dictionary = {}
    neg_count = []
    pos_count = []
    neu_count = []
    lexicon_keys = lexicon_dictionary.keys()
    sentences = sent_tokenize(text)
    for sentence in sentences:
        tokens = apply_lemmatization(remove_punctuation(word_tokenize(str(sentence))))
        for token in tokens:
            if token in lexicon_keys:
                token_dictionary[token] = token_dictionary.get(token, 0) + lexicon_dictionary[token]
                new_token_dict = token_dictionary
                txt_values = list(new_token_dict.values())
    for sentence in sentences:
        if sum(txt_values) == 0:
            neu_count.append(sentence)
        if sum(txt_values) > 0:
            pos_count.append(sentence)
        if sum(txt_values) < 0:
            neg_count.append(sentence)
    print("neg count: " + str(len(neg_count)))
    print("neu count: " + str(len(neu_count)))
    print("pos count: " + str(len(pos_count)))

Edit:

No, I didn't fix it. I tested it with "I am able. I am unable." "I am able" is positive and "I am unable" is negative, so the output should be:

neg count: 1
pos count: 1

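Instead it says: pos count: 2

Edit:

I got this suggestion, but I'm not quite sure how to implement it.

"I think the biggest problem with your code is that txt_values isn't saved anywhere after each iteration, so the final value in it will be whatever it was on the last iteration of the loop. You should create three variables, one counter for each tone, and increment one of them on each cycle of the for loop. Then you don't even need the second for loop; just print out the values."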
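One possible reading of that suggestion, as a minimal sketch rather than a tested fix for the original function: keep three integer counters, start the score from zero for every sentence, and increment exactly one counter per sentence. The names `count_tones` and `demo_lexicon` are hypothetical, the scores are made up for the example, and a plain regex stands in for the question's `word_tokenize`, `remove_punctuation`, and `apply_lemmatization` helpers. Sentence splitting is assumed to have happened already (for example with `sent_tokenize`).

```python
import re


def count_tones(sentences, lexicon):
    """Return (neg, neu, pos) sentence counts for an iterable of sentences."""
    neg = neu = pos = 0
    for sentence in sentences:
        # The score starts from zero for every sentence, so nothing
        # carries over from the sentences before it.
        score = 0.0
        for token in re.findall(r"[a-z']+", sentence.lower()):
            score += lexicon.get(token, 0.0)
        # Exactly one counter is incremented per sentence.
        if score > 0:
            pos += 1
        elif score < 0:
            neg += 1
        else:
            neu += 1
    return neg, neu, pos


# Hypothetical scores chosen only for this example.
demo_lexicon = {"able": 0.5, "unable": -0.5}
demo_sentences = ["I am able.", "I am unable."]
neg, neu, pos = count_tones(demo_sentences, demo_lexicon)
print("neg count: " + str(neg))
print("neu count: " + str(neu))
print("pos count: " + str(pos))
```

With these made-up scores the sketch prints a neg count of 1, a neu count of 0, and a pos count of 1. Compared with the code in the question, the difference is that the running total is reset per sentence instead of accumulating in one dictionary across the whole text, and the classification happens once per sentence instead of once per matching token.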
