VADER lexicon results do not add up to 1.0

Posted 2024-10-03 13:24:18


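My data consists of tweets from Stocktwits, and I am trying to run sentiment analysis on them with the VADER library in Python. The problem is that the pos, neu, and neg fields do not sum to 1.0. Instead, they add up to 2.0: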

{'neg': 0.0, 'neu': 2.0, 'pos': 0.0, 'compound': 0.0}

Is this normal?


1 answer
User
#1 · Posted 2024-10-03 13:24:18

Yes, this is normal. The VADER documentation shows similar results:

VADER is smart, handsome, and funny.              - {'pos': 0.746, 'compound': 0.8316, 'neu': 0.254, 'neg': 0.0}
VADER is smart, handsome, and funny!              - {'pos': 0.752, 'compound': 0.8439, 'neu': 0.248, 'neg': 0.0}
...
VADER is not smart, handsome, nor funny.            - {'pos': 0.0, 'compound': -0.7424, 'neu': 0.354, 'neg': 0.646}

The pos, neu, and neg scores are ratios for proportions of text that fall in each category (so these should all add up to be 1... or close to it with float operation). These are the most useful metrics if you want multidimensional measures of sentiment for a given sentence.
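
To check whether a given result satisfies the property described in the quoted passage, a small helper can sum the three fields and compare the total with 1.0. This is only a sketch: the helper names and the tolerance are my own assumptions, and it works on plain dicts shaped like the output of polarity_scores() rather than calling the library itself.

```python
import math


def proportions_sum(scores):
    """Return pos + neu + neg for a VADER-style score dict."""
    return scores["pos"] + scores["neu"] + scores["neg"]


def looks_normalized(scores, tol=0.002):
    """True if pos + neu + neg is within `tol` of 1.0.

    The tolerance is an assumption: it leaves room for each of the
    three fields having been rounded to three decimal places.
    """
    return math.isclose(proportions_sum(scores), 1.0, abs_tol=tol)


# First example line quoted in the answer.
doc_example = {'pos': 0.746, 'compound': 0.8316, 'neu': 0.254, 'neg': 0.0}
# The dict reported in the question.
question_output = {'neg': 0.0, 'neu': 2.0, 'pos': 0.0, 'compound': 0.0}

print(looks_normalized(doc_example))
print(looks_normalized(question_output))
```

If the check fails for dicts that come straight from a single polarity_scores() call, it is probably worth looking at how the dicts are produced or combined before they are printed.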

You may want to use the compound score instead:

The compound score is computed by summing the valence scores of each word in the lexicon, adjusted according to the rules, and then normalized to be between -1 (most extreme negative) and +1 (most extreme positive). This is the most useful metric if you want a single unidimensional measure of sentiment for a given sentence. Calling it a 'normalized, weighted composite score' is accurate.

It is also useful for researchers who would like to set standardized thresholds for classifying sentences as either positive, neutral, or negative.
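
As an illustration of the thresholding idea in the quote above, the sketch below maps a compound score to one of three labels. The function name and the cutoff of 0.05 are assumptions on my part (0.05 is a commonly used value, but nothing in this thread specifies it), so tune the threshold for your own data.

```python
def classify_compound(compound, threshold=0.05):
    """Map a compound score in [-1, 1] to a coarse sentiment label.

    `threshold` is an assumed cutoff, not a value taken from this thread.
    """
    if compound >= threshold:
        return "positive"
    if compound <= -threshold:
        return "negative"
    return "neutral"


# Compound scores taken from the dicts quoted in this thread.
for value in (0.8316, -0.7424, 0.0):
    print(value, classify_compound(value))
```

With real data, the compound value would come from the 'compound' key of each result dict.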
