Finding matching entries across DataFrames for a calculation

Posted 2024-09-28 19:30:32


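I have the two DataFrames below and want to compute the correlation coefficient between Weight and Score.

It works fine when both columns are filled with actual values. When they are not, the zeros are treated as real values in the correlation calculation.

For example, Addison's and Caden's weights are 0, and Jack and Noah have no weight at all. I want to exclude all of them from the calculation.

(From my attempts, it seems only the rows up to the shorter length are considered, i.e. Jack and Noah are excluded automatically. Is that right?)

How can I include only the people with non-zero values in the calculation?

Thank you.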

import pandas as pd

Weight = {'Name': ["Abigail","Addison","Aiden","Amelia","Aria","Ava","Caden","Charlotte","Chloe","Elijah"], 
'Weight': [10, 0, 12, 20, 25, 10, 0, 18, 16, 13]}

df_wt = pd.DataFrame(Weight)

Score = {'Name': ["Abigail","Addison","Aiden","Amelia","Aria","Ava","Caden","Charlotte","Chloe","Elijah", "Jack", "Noah"], 
'Score': [360, 476, 345, 601, 604, 313, 539, 531, 507, 473, 450, 470]}

df_sc = pd.DataFrame(Score)

print(df_wt.Weight.corr(df_sc.Score))

2 Answers

Build a mask for the non-zero values and take the common index:

df_wt.set_index('Name', inplace=True)
df_sc.set_index('Name', inplace=True)

mask = df_wt['Weight'].ne(0)
common_index = df_wt.loc[mask, :].index
df_wt.loc[common_index, 'Weight'].corr(df_sc.loc[common_index, 'Score'])

0.923425144491911
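
On the parenthetical question in the post: as far as I know, Series.corr first aligns the two Series on their index labels and keeps only labels present in both, so Jack and Noah are dropped because they have no match, not because of the length as such. Zeros are ordinary values and are not dropped. A minimal sketch with the question's data; the names wt, sc, wt_aligned and sc_aligned are illustrative only:

```python
import pandas as pd

# Data copied from the question.
df_wt = pd.DataFrame({
    'Name': ["Abigail", "Addison", "Aiden", "Amelia", "Aria", "Ava",
             "Caden", "Charlotte", "Chloe", "Elijah"],
    'Weight': [10, 0, 12, 20, 25, 10, 0, 18, 16, 13],
})
df_sc = pd.DataFrame({
    'Name': ["Abigail", "Addison", "Aiden", "Amelia", "Aria", "Ava",
             "Caden", "Charlotte", "Chloe", "Elijah", "Jack", "Noah"],
    'Score': [360, 476, 345, 601, 604, 313, 539, 531, 507, 473, 450, 470],
})

# Index by Name without mutating the original frames.
wt = df_wt.set_index('Name')['Weight']
sc = df_sc.set_index('Name')['Score']

# An inner alignment keeps only labels present in both Series,
# so Jack and Noah (who have no weight) drop out.
wt_aligned, sc_aligned = wt.align(sc, join='inner')
print(len(wt_aligned), 'Jack' in wt_aligned.index)

# Zeros still count unless they are filtered out explicitly.
r_with_zeros = wt.corr(sc)
r_nonzero = wt[wt.ne(0)].corr(sc)
print(r_with_zeros, r_nonzero)
```

The second printed correlation should agree with the value shown above, since filtering the zeros on the Weight side is enough once corr aligns on Name.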

If both DataFrames can contain zeros, then:

mask1 = df_wt['Weight'].ne(0)
mask2 = df_sc['Score'].ne(0)
common_index = df_wt.loc[mask1, :].index.intersection(df_sc.loc[mask2, :].index)
df_wt.loc[common_index, 'Weight'].corr(df_sc.loc[common_index, 'Score'])
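
The same idea can also be sketched with an inner merge on Name, which may read more easily when neither frame has been indexed yet. The helper nonzero_corr below is hypothetical (it is not a pandas function), and it assumes the key column is unique in both frames:

```python
import pandas as pd

# Data copied from the question.
df_wt = pd.DataFrame({
    'Name': ["Abigail", "Addison", "Aiden", "Amelia", "Aria", "Ava",
             "Caden", "Charlotte", "Chloe", "Elijah"],
    'Weight': [10, 0, 12, 20, 25, 10, 0, 18, 16, 13],
})
df_sc = pd.DataFrame({
    'Name': ["Abigail", "Addison", "Aiden", "Amelia", "Aria", "Ava",
             "Caden", "Charlotte", "Chloe", "Elijah", "Jack", "Noah"],
    'Score': [360, 476, 345, 601, 604, 313, 539, 531, 507, 473, 450, 470],
})


def nonzero_corr(left, right, key, x, y):
    """Hypothetical helper: correlate x and y over rows present in both
    frames where neither value is zero."""
    # The inner merge keeps only keys found in both frames.
    merged = left.merge(right, on=key, how='inner')
    # Keep rows where both values are non-zero.
    keep = merged[x].ne(0) & merged[y].ne(0)
    return merged.loc[keep, x].corr(merged.loc[keep, y])


r = nonzero_corr(df_wt, df_sc, 'Name', 'Weight', 'Score')
print(r)
```

With the question's data no Score is zero, so this gives the same result as masking Weight alone.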

Use Series.map to add a new column, remove the 0 rows by boolean indexing, and finally apply your solution within the same DataFrame:

df_wt['Score'] = df_wt['Name'].map(df_sc.set_index('Name')['Score'])

df_wt = df_wt[df_wt['Weight'].ne(0)]
print(df_wt)
        Name  Weight  Score
0    Abigail      10    360
2      Aiden      12    345
3     Amelia      20    601
4       Aria      25    604
5        Ava      10    313
7  Charlotte      18    531
8      Chloe      16    507
9     Elijah      13    473

print(df_wt.Weight.corr(df_wt.Score))
0.923425144491911
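
A further variation, offered only as a sketch: if a weight of 0 really means "not recorded", it can be turned into NaN, because Series.corr skips pairs with missing values. This assumes 0 is never a legitimate measurement; the names weight and score are illustrative:

```python
import pandas as pd

# Data copied from the question.
df_wt = pd.DataFrame({
    'Name': ["Abigail", "Addison", "Aiden", "Amelia", "Aria", "Ava",
             "Caden", "Charlotte", "Chloe", "Elijah"],
    'Weight': [10, 0, 12, 20, 25, 10, 0, 18, 16, 13],
})
df_sc = pd.DataFrame({
    'Name': ["Abigail", "Addison", "Aiden", "Amelia", "Aria", "Ava",
             "Caden", "Charlotte", "Chloe", "Elijah", "Jack", "Noah"],
    'Score': [360, 476, 345, 601, 604, 313, 539, 531, 507, 473, 450, 470],
})

weight = df_wt.set_index('Name')['Weight']
score = df_sc.set_index('Name')['Score']

# Replace zeros with NaN (assumption: 0 means "missing" here).
weight = weight.mask(weight.eq(0))

# corr aligns on Name and ignores pairs where either value is NaN.
r = weight.corr(score)
print(r)
```

This leaves both original frames untouched and keeps the zero rows available for other uses.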
