I have three DataFrames:
import pandas as pd
maindf = pd.DataFrame({'Risk':['AB','AC','AD'],'amnt':[100,200,300]})
maindf
Out[4]:
  Risk  amnt
0   AB   100
1   AC   200
2   AD   300
disc = pd.DataFrame({'Risk':['AB','AB','AB','AC','AC','AD'], 'discPerc':[0.4,0.5,0.1,0.5,0.5,1]})
disc
Out[7]:
  Risk  discPerc
0   AB       0.4
1   AB       0.5
2   AB       0.1
3   AC       0.5
4   AC       0.5
5   AD       1.0
ops = pd.DataFrame({'Risk':['AB','AB','AC','AC','AD','AD'], 'opsPerc':[0.5,0.5,0.4,0.6,0.2,0.8]})
ops
Out[9]:
  Risk  opsPerc
0   AB      0.5
1   AB      0.5
2   AC      0.4
3   AC      0.6
4   AD      0.2
5   AD      0.8
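I want to join these DataFrames onto maindf so that, when I group by the 'Risk' column, discPerc and opsPerc each sum to 1 (as they do in the disc/ops DataFrames).
A simple double left join gives: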
merged = pd.merge(maindf,disc,on='Risk',how='left')
merged = pd.merge(merged,ops, on = 'Risk', how = 'left')
merged
Out[19]:
   Risk  amnt  discPerc  opsPerc
0    AB   100       0.4      0.5
1    AB   100       0.4      0.5
2    AB   100       0.5      0.5
3    AB   100       0.5      0.5
4    AB   100       0.1      0.5
5    AB   100       0.1      0.5
6    AC   200       0.5      0.4
7    AC   200       0.5      0.6
8    AC   200       0.5      0.4
9    AC   200       0.5      0.6
10   AD   300       1.0      0.2
11   AD   300       1.0      0.8
Grouping that result gives:
merged.groupby('Risk').sum()
Out[20]:
      amnt  discPerc  opsPerc
Risk
AB     600       2.0      3.0
AC     800       2.0      2.0
AD     600       2.0      1.0
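The inflated totals come from how a join on a repeated key behaves: within each Risk, every disc row is paired with every ops row, so the row count per key is the product of the two counts. A minimal sketch that checks this against the 12-row result above (the names disc_rows, ops_rows and expected_rows are illustrative, not from the original post):

```python
import pandas as pd

disc = pd.DataFrame({'Risk': ['AB', 'AB', 'AB', 'AC', 'AC', 'AD'],
                     'discPerc': [0.4, 0.5, 0.1, 0.5, 0.5, 1]})
ops = pd.DataFrame({'Risk': ['AB', 'AB', 'AC', 'AC', 'AD', 'AD'],
                    'opsPerc': [0.5, 0.5, 0.4, 0.6, 0.2, 0.8]})

# Number of rows each frame contributes per Risk value.
disc_rows = disc.groupby('Risk').size()
ops_rows = ops.groupby('Risk').size()

# Chaining two joins on 'Risk' pairs every disc row with every ops row,
# so the merged frame has disc_rows * ops_rows rows per key.
expected_rows = disc_rows * ops_rows
print(expected_rows)  # AB 6, AC 4, AD 2 -> 12 rows in total
```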
Instead, I would like the merged DataFrame to look like:
   Risk  amnt  discPerc  opsPerc
0    AB   100       0.4      nan
1    AB   100       0.5      nan
2    AB   100       0.1      nan
3    AB   100       nan      0.5
4    AB   100       nan      0.5
5    AC   200       0.5      nan
6    AC   200       0.5      nan
7    AC   200       nan      0.4
8    AC   200       nan      0.6
9    AD   300       1.0      nan
10   AD   300       nan      0.2
11   AD   300       nan      0.8
That way the percentages would sum to 1 within each Risk group.
You can concatenate disc and ops first, then merge the result with the original DataFrame.
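The code that accompanied this answer is not present here, so the following is only a sketch of one way to do it with pd.concat and DataFrame.merge. The variable names stacked and totals are illustrative and not from the original answer.

```python
import pandas as pd

maindf = pd.DataFrame({'Risk': ['AB', 'AC', 'AD'], 'amnt': [100, 200, 300]})
disc = pd.DataFrame({'Risk': ['AB', 'AB', 'AB', 'AC', 'AC', 'AD'],
                     'discPerc': [0.4, 0.5, 0.1, 0.5, 0.5, 1]})
ops = pd.DataFrame({'Risk': ['AB', 'AB', 'AC', 'AC', 'AD', 'AD'],
                    'opsPerc': [0.5, 0.5, 0.4, 0.6, 0.2, 0.8]})

# Stack disc and ops vertically. A column that exists in only one
# of the two frames is filled with NaN for the other frame's rows.
stacked = pd.concat([disc, ops], ignore_index=True, sort=False)

# Left-join onto maindf so every stacked row picks up its amnt.
merged = maindf.merge(stacked, on='Risk', how='left')

# sum() skips NaN by default, so each percentage column totals 1 per Risk.
totals = merged.groupby('Risk')[['discPerc', 'opsPerc']].sum()

print(merged)
print(totals)
```

Note that amnt is still repeated on every row of merged, so summing it per group would overcount. Taking the first value per Risk (or reading it from maindf) avoids that.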