How do I add a column in Pandas based on a selected row filter?

Posted 2024-05-06 10:13:56


Hi, I want to give each student a final score equal to their current score plus their score in their favourite subject.

import pandas as pd
new_data = [['tom', 31, 50, 30, 20, 'English'], ['nick', 30, 42, 23, 21, 'Math'], ['juli', 39, 14, 40, 38, 'Science']]
df = pd.DataFrame(new_data, columns = ['Name','Current_Score','English','Science','Math','Favourite_Subject'])
for subj in df['Favourite_Subject'].unique():
    mask = (df['Favourite_Subject'] == subj)
    df['Final_Score'] = df[mask].apply(lambda row: row['Current_Score'] + row[subj], axis=1)

   Name  Current_Score  English  Science  Math Favourite_Subject  Final_Score
0   tom             31       50       30    20           English          NaN
1  nick             30       42       23    21              Math          NaN
2  juli             39       14       40    38           Science         79.0

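When I run the code above, I get NaN in the other two entries of the Final_Score column. How can I get the result below, without the earlier values being overwritten by NaN? Thanks.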


   Name  Current_Score  English  Science  Math Favourite_Subject  Final_Score
0   tom             31       50       30    20           English           81
1  nick             30       42       23    21              Math           51
2  juli             39       14       40    38           Science           79

3 Answers

You don't need a loop; you can apply the function directly to the DataFrame:

import pandas as pd
new_data = [['tom', 31, 50, 30, 20, 'English'], ['nick', 30, 42, 23, 21, 'Math'], ['juli', 39, 14, 40, 38, 'Science']]
df = pd.DataFrame(new_data, columns = ['Name','Current_Score','English','Science','Math','Favourite_Subject'])
df['Final_Score'] = df.apply(lambda x: x['Current_Score'] + x[x['Favourite_Subject']], axis=1)
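
For anyone who prefers to keep the loop from the question: the NaN values appear because each iteration assigns a Series covering only the masked rows to the whole Final_Score column, so the last subject in the loop overwrites everything written before it. A minimal sketch of one possible fix, assuming the same DataFrame as in the question, writes only to the masked rows with df.loc:

```python
import pandas as pd

new_data = [['tom', 31, 50, 30, 20, 'English'],
            ['nick', 30, 42, 23, 21, 'Math'],
            ['juli', 39, 14, 40, 38, 'Science']]
df = pd.DataFrame(new_data, columns=['Name', 'Current_Score', 'English',
                                     'Science', 'Math', 'Favourite_Subject'])

# Create the column up front so every row starts as NaN
df['Final_Score'] = float('nan')

for subj in df['Favourite_Subject'].unique():
    mask = df['Favourite_Subject'] == subj
    # Assign only to the rows selected by the mask; other rows are left untouched
    df.loc[mask, 'Final_Score'] = df.loc[mask, 'Current_Score'] + df.loc[mask, subj]

# Every row has been filled by now, so the float column can be cast to int
df['Final_Score'] = df['Final_Score'].astype(int)
print(df)
```

The loop runs once per distinct subject rather than once per row, which may matter on larger frames.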

We can use a lookup to find the score that corresponds to each row's Favourite_Subject, then add it to Current_Score to compute Final_Score:

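# Column position of each row's favourite subject
i = df.columns.get_indexer(df['Favourite_Subject'])
# Index rows by position so this also works when df.index is not 0..n-1
df['Final_Score'] = df['Current_Score'] + df.values[range(len(df)), i]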

   Name  Current_Score  English  Science  Math Favourite_Subject Final_Score
0   tom             31       50       30    20           English          81
1  nick             30       42       23    21              Math          51
2  juli             39       14       40    38           Science          79
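
A variation on the lookup above, offered as a sketch rather than the answerer's own method: restricting the lookup to the numeric subject columns keeps the intermediate array (and therefore Final_Score) as integers instead of Python objects. The subjects list and the check for unknown subjects are assumptions added for illustration.

```python
import numpy as np
import pandas as pd

new_data = [['tom', 31, 50, 30, 20, 'English'],
            ['nick', 30, 42, 23, 21, 'Math'],
            ['juli', 39, 14, 40, 38, 'Science']]
df = pd.DataFrame(new_data, columns=['Name', 'Current_Score', 'English',
                                     'Science', 'Math', 'Favourite_Subject'])

# Assumed list of subject columns; adjust to match the real data
subjects = ['English', 'Science', 'Math']
scores = df[subjects].to_numpy()  # purely numeric 2-D array

# Position of each row's favourite subject within `subjects` (-1 if not found)
col_pos = pd.Index(subjects).get_indexer(df['Favourite_Subject'])
if (col_pos == -1).any():
    raise ValueError('Favourite_Subject contains a subject with no score column')

# Pick one value per row: (row 0, col_pos[0]), (row 1, col_pos[1]), ...
picked = scores[np.arange(len(df)), col_pos]
df['Final_Score'] = df['Current_Score'] + picked
print(df)
```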

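You can use DataFrame.apply() with axis=1 and take the value in the Favourite_Subject column as a column label to fetch the value from the corresponding column. Then add the result to df['Current_Score'], as follows: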

df['Final_Score'] = df['Current_Score'] + df.apply(lambda x: x[x['Favourite_Subject']], axis=1)

Result:

print(df)

   Name  Current_Score  English  Science  Math Favourite_Subject  Final_Score
0   tom             31       50       30    20           English           81
1  nick             30       42       23    21              Math           51
2  juli             39       14       40    38           Science           79
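
To make the row-wise lookup in the apply call more concrete, here is a small sketch of what the lambda does for a single row. With axis=1, each row is passed in as a Series whose labels are the column names, so the value in Favourite_Subject can itself be used as a label. The variable names row, label, and value are only illustrative.

```python
import pandas as pd

new_data = [['tom', 31, 50, 30, 20, 'English'],
            ['nick', 30, 42, 23, 21, 'Math'],
            ['juli', 39, 14, 40, 38, 'Science']]
df = pd.DataFrame(new_data, columns=['Name', 'Current_Score', 'English',
                                     'Science', 'Math', 'Favourite_Subject'])

row = df.iloc[0]                  # the Series that apply(axis=1) would pass for tom
label = row['Favourite_Subject']  # the column label stored in this row
value = row[label]                # the score found under that label
print(label, value)               # prints: English 50
```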
