How to use a user-defined function in pandas to return a value based on column values and timestamps

Date        Content  Cleaned-content  Sentiment
11/12/2020  abb bbc  abb              Bad
12/10/2020  xyz      xxy              Good
11/24/2020  tyu      yuu              Neutral
12/16/2020  iop      yui              Bad

Date        Content  Cleaned-content  Sentiment  Day_Sentiment
11/12/2020  abb bbc  abb              Bad        Bad
12/10/2020  xyz      xxy              Good       Bad
11/24/2020  tyu      yuu              Neutral    Bad
12/16/2020  iop      yui              Bad        Bad

df = input_data.join(results)

def compare_def(df):
    no.bad_senti= df.loc[df['Sentiment'] == 'Bad']
    no.neut_senti = df.loc[df['Sentiment'] == 'Neutral']
    no.good_senti= df.loc[df['Sentiment'] == 'Good']
    if ((no.bad_senti> no.good_senti) & (no.bad_senti> no.neut_senti)):
        output = 'Bad'
    elif ((no.good_senti> no.bad_senti) & (no.good_senti> no.neut_senti)):
        output= 'Good'
    elif ((no.neut_senti> no.bad_senti) & (no.neut_senti> no.good_senti)):
        output= 'Neutral'
    elif no.good_senti== no.bad_senti:
        output= 'Neutral'
    elif no.bad_senti== no.neut_senti:
        output= 'bad'
    elif no.good_senti== no.neut_senti:
        output= 'good'
    else:
        output= 'Neutral'
    return output

df['Day_Sentiment'] = output

1 Answer

User

#1 · Posted on 2024-06-24 13:26:28

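There are several issues with your code. To begin with, the variables bad, good and neut are pandas Series of different lengths that contain strings. You then try to evaluate several conditional tests on them, such as if ((bad> good) & (bad> neut), which raises a ValueError. I am not really sure what logic you are trying to implement, but the following template might help: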

def compare_data(row):
    value = 'Good'
    # The logic here escapes me
    # Evaluate the contents of row['Sentiment'] and modify value
    return value

# axis=1 passes each row to compare_data as a Series
df["Day Sentiment"] = df.apply(compare_data, axis=1)

Yields:

   Date        Content  Cleaned-content  Sentiment  Day Sentiment
0  11/12/2020  abb bbc  abb              Bad        Good
1  12/10/2020  xyz      xxy              Good       Good
2  11/24/2020  tyu      yuu              Neutral    Good
3  12/16/2020  iop      yui              Bad        Good
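
If the goal is to label every row with the most frequent sentiment, one possible approach is to compare counts (plain integers) rather than the filtered pandas objects themselves. The sketch below is only a guess at the intended logic: the helper name majority_sentiment is hypothetical, the tie-breaking order is copied from the branches in the question, the tie results are capitalized to match the other labels (the question returned 'bad' and 'good'), and the sample frame is rebuilt by hand from the table in the question.

```python
import pandas as pd


def majority_sentiment(sentiments: pd.Series) -> str:
    """Return the dominant label; ties follow the order used in the question."""
    counts = sentiments.value_counts()
    # Plain integers compare cleanly inside an if statement.
    bad = int(counts.get('Bad', 0))
    good = int(counts.get('Good', 0))
    neut = int(counts.get('Neutral', 0))

    if bad > good and bad > neut:
        return 'Bad'
    if good > bad and good > neut:
        return 'Good'
    if neut > bad and neut > good:
        return 'Neutral'
    # No strict winner: resolve ties in the same order as the question.
    if good == bad:
        return 'Neutral'
    if bad == neut:
        return 'Bad'      # assumption: capitalized, the question used 'bad'
    if good == neut:
        return 'Good'     # assumption: capitalized, the question used 'good'
    return 'Neutral'      # fallback kept to mirror the question


# Sample data rebuilt by hand from the question's table.
df = pd.DataFrame({
    'Date': ['11/12/2020', '12/10/2020', '11/24/2020', '12/16/2020'],
    'Content': ['abb bbc', 'xyz', 'tyu', 'iop'],
    'Cleaned-content': ['abb', 'xxy', 'yuu', 'yui'],
    'Sentiment': ['Bad', 'Good', 'Neutral', 'Bad'],
})

# One label for the whole frame, broadcast to every row.
df['Day_Sentiment'] = majority_sentiment(df['Sentiment'])

# Variant, if the label should be computed separately for each date.
per_day = df.groupby('Date')['Sentiment'].transform(majority_sentiment)
```

With the sample data, the whole-frame version assigns Bad to every row, which matches the expected output in the question. The per_day variant is only relevant if "Day_Sentiment" is meant to be computed per date; since each date appears once in the sample, it would simply repeat each row's own sentiment.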
