Renaming multiple column values in Pandas

Posted 2024-07-03 07:25:34

I have customer reviews stored in a 'Sentiment' column. This is the result of data['Sentiment'].unique():

array(['Negative', 'Positive', '?', 'Neutral', 'nan', 'positive',
       'neutral', 'negative', 'Neg', 'ppos', 'ne'], dtype=object)

I'm trying to group these values into 'positive', 'negative' and 'neutral', and I created three mapping lists:

positive = ['Positive','positive', 'ppos']
negative = ['Negative', 'negative', 'Neg']
neutral = ['Neutral', 'neutral', 'ne']

Everything else should be NaN. I tried iterrows(), roughly like this:

for idx, row in data.iterrows():
    if row['Sentiment'].isin(positive):
        row['Sentiment'] == 'positive'
               ...

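It doesn't work, and it doesn't seem efficient either. I also tried working with Series and boolean operations, which looks like a promising approach, but I'd really like to know whether there is a more concise solution.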


3 answers

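Use numpy.select. Pass the conditions as the first argument, the values corresponding to those conditions as the second, and, as the third, the default to use when no condition matches: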

import numpy as np

conditions = [
    df['Sentiment'].isin(positive),
    df['Sentiment'].isin(neutral),
    df['Sentiment'].isin(negative)
]
values = ['positive', 'neutral', 'negative']

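# Use None as the default: mixing string choices with np.nan can yield the
# string 'nan' or raise a dtype error, depending on the NumPy version.
df['Sentiment'] = np.select(conditions, values, default=None)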
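
As a self-contained sketch of the same idea, the snippet below rebuilds a small DataFrame from the unique values listed in the question. The sample rows are an assumption for illustration, not the asker's real data, and default=None is used so that unmatched entries end up as missing values in an object column:

```python
import numpy as np
import pandas as pd

# Hypothetical sample built from the unique values shown in the question.
df = pd.DataFrame({'Sentiment': ['Negative', 'Positive', '?', 'Neutral', 'nan',
                                 'positive', 'neutral', 'negative', 'Neg',
                                 'ppos', 'ne']})

positive = ['Positive', 'positive', 'ppos']
negative = ['Negative', 'negative', 'Neg']
neutral = ['Neutral', 'neutral', 'ne']

# One boolean mask per target label, in the same order as `values`.
conditions = [
    df['Sentiment'].isin(positive),
    df['Sentiment'].isin(neutral),
    df['Sentiment'].isin(negative),
]
values = ['positive', 'neutral', 'negative']

# Rows matching none of the masks receive None, which pandas treats as missing.
df['Sentiment'] = np.select(conditions, values, default=None)

print(df['Sentiment'].value_counts(dropna=False))
```

With this sample, the '?' and 'nan' entries are the only ones that match no list, so they are the ones reported as missing by isna().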

You can build a dictionary that pairs the old values with the new ones, then replace the contents of the Sentiment column with pandas map:

# list of old and new values
old_values = [['Positive','positive', 'ppos'],
              ['Negative', 'negative', 'Neg'],
              ['Neutral', 'neutral', 'ne']]

new_values = ['positive','negative','neutral']

merge = zip(new_values,old_values)

#create mapping
d = {}
for new, old in merge:
    for i in old:
        d[i] = new

print(d)

{'Positive': 'positive',
 'positive': 'positive',
 'ppos': 'positive',
 'Negative': 'negative',
 'negative': 'negative',
 'Neg': 'negative',
 'Neutral': 'neutral',
 'neutral': 'neutral',
 'ne': 'neutral'}

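# apply the mapping to the Series and assign the result back;
# values that are not keys of d become NaN
df['Sentiment'] = df['Sentiment'].map(d)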
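
The mapping can also be built in one step with a dict comprehension. The following is a minimal sketch under the assumption that the groups are kept in a dictionary keyed by the new label; the names `groups` and the sample DataFrame are hypothetical and not part of the question:

```python
import pandas as pd

# Hypothetical sample data using some of the values from the question.
df = pd.DataFrame({'Sentiment': ['Positive', 'ppos', 'Neg', 'ne', '?', 'nan']})

# New label -> list of old spellings that should collapse into it.
groups = {
    'positive': ['Positive', 'positive', 'ppos'],
    'negative': ['Negative', 'negative', 'Neg'],
    'neutral': ['Neutral', 'neutral', 'ne'],
}

# Invert the grouping into an old -> new lookup table.
d = {old: new for new, olds in groups.items() for old in olds}

# Series.map returns NaN for any value that is not a key of d.
df['Sentiment'] = df['Sentiment'].map(d)

print(df)
```

Because map leaves unmatched values as NaN, entries such as '?' and the string 'nan' need no separate handling here.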

Use Series.apply:

def sentiment_group(sentiment):
    if sentiment in ['Positive','positive', 'ppos']:
        return 'positive'
    if sentiment in ['Negative', 'negative', 'Neg']:
        return 'negative'
    if sentiment in ['Neutral', 'neutral', 'ne']:
        return 'neutral'
    else:
        # the question asks for everything else to become NaN
        return float('nan')

data['sentiment_group'] = data['Sentiment'].apply(sentiment_group)
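
If the chain of if statements grows, the same apply call can be driven by a lookup table instead. This is only a sketch: the `LOOKUP` name and the sample rows are assumptions made for illustration, and unrecognised values are turned into NaN as the question requests:

```python
import pandas as pd

# Hypothetical old -> new lookup covering the spellings from the question.
LOOKUP = {
    'Positive': 'positive', 'positive': 'positive', 'ppos': 'positive',
    'Negative': 'negative', 'negative': 'negative', 'Neg': 'negative',
    'Neutral': 'neutral', 'neutral': 'neutral', 'ne': 'neutral',
}

def sentiment_group(sentiment):
    """Return the canonical label, or NaN for anything unrecognised."""
    return LOOKUP.get(sentiment, float('nan'))

# Hypothetical sample data.
data = pd.DataFrame({'Sentiment': ['Positive', 'Neg', 'ne', '?']})

# apply calls sentiment_group once per element of the column.
data['sentiment_group'] = data['Sentiment'].apply(sentiment_group)

print(data)
```

Note that apply runs a Python function per row, so on large columns it will usually be slower than the vectorised isin or map approaches.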
