我有一个熊猫数据框
df = pd.DataFrame({'Firm': ['Firm1','Firm1','Firm1','Firm1','Firm1','Firm1','Firm2','Firm2','Firm2','Firm2','Firm2','Firm2'],'Location' : ['Country1', 'Country1', 'Country1', 'Country2', 'Country2', 'Country2','Country1', 'Country1', 'Country1', 'Country2', 'Country2', 'Country2'], 'Currency' : ['Curr1', 'Curr2', 'Curr3', 'Curr1', 'Curr2', 'Curr3','Curr1', 'Curr2', 'Curr3', 'Curr1', 'Curr2', 'Curr3'], 'Value' : [100, 105, 110, 100, 95, 120, 95, 110, 115, 105, 120, 90] })
看起来是这样的:
df:
Firm Location Currency Value
0 Firm1 Country1 Curr1 100
1 Firm1 Country1 Curr2 105
2 Firm1 Country1 Curr3 110
3 Firm1 Country2 Curr1 100
4 Firm1 Country2 Curr2 95
5 Firm1 Country2 Curr3 120
6 Firm2 Country1 Curr1 95
7 Firm2 Country1 Curr2 110
8 Firm2 Country1 Curr3 115
9 Firm2 Country2 Curr1 105
10 Firm2 Country2 Curr2 120
11 Firm2 Country2 Curr3 90
现在,我想计算每个公司位置组的Curr3和Curr2(列值)之间的差异,并根据结果更改Curr3的值。结果df应如下所示:
Firm Location Currency Value
0 Firm1 Country1 Curr1 100
1 Firm1 Country1 Curr2 105
2 Firm1 Country1 Curr3 5
3 Firm1 Country2 Curr1 100
4 Firm1 Country2 Curr2 95
5 Firm1 Country2 Curr3 25
6 Firm2 Country1 Curr1 95
7 Firm2 Country1 Curr2 110
8 Firm2 Country1 Curr3 5
9 Firm2 Country2 Curr1 105
10 Firm2 Country2 Curr2 120
11 Firm2 Country2 Curr3 -30
我已经尝试使用.groupby
和.apply
来给出结果,但是我想在原始数据帧中进行转换
df2 = df.groupby(['Firm','Location']).apply(lambda g: g[g.Currency == 'Curr3'].Value.values[0] - g[g.Currency == 'Curr2'].Value.values[0])
df2:
Firm Location 0
Firm1 Country1 5
Firm1 Country2 25
Firm2 Country1 5
Firm2 Country2 -30
我不知道如何在原始df中执行此操作。我还使用.transform
尝试了相同的方法,但是它会产生一个错误:
df2 = df.groupby(['Firm','Location']).transform(lambda g: g[g.Currency == 'Curr3'].Value.values[0] - g[g.Currency == 'Curr2'].Value.values[0])
AttributeError: ("'Series' object has no attribute 'Currency'", 'occurred at index Currency')
----基于Erfan解决方案的更新:
newvals = (
df.where(df['Currency'].isin(['Curr2', 'Curr3']))
.groupby(['Firm', 'Location'])['Value'].diff()
)
df['Value'] = newvals.fillna(df['Value'])
但是,如果df看起来像这样(货币未排序),则解决方案不再有效(因为diff()只计算与上一个值的差值)
Firm Location Currency Value
0 Firm1 Country1 Curr2 100
1 Firm1 Country1 Curr1 105
2 Firm1 Country1 Curr3 110
3 Firm1 Country2 Curr3 100
4 Firm1 Country2 Curr2 95
5 Firm1 Country2 Curr1 120
6 Firm2 Country1 Curr1 95
7 Firm2 Country1 Curr2 110
8 Firm2 Country1 Curr3 115
9 Firm2 Country2 Curr2 105
10 Firm2 Country2 Curr3 120
11 Firm2 Country2 Curr1 90
->;结果:
Firm Location Currency Value
0 Firm1 Country1 Curr2 100.0
1 Firm1 Country1 Curr1 105.0
2 Firm1 Country1 Curr3 10.0
3 Firm1 Country2 Curr3 100.0
4 Firm1 Country2 Curr2 -5.0
5 Firm1 Country2 Curr1 120.0
6 Firm2 Country1 Curr1 95.0
7 Firm2 Country1 Curr2 110.0
8 Firm2 Country1 Curr3 5.0
9 Firm2 Country2 Curr2 105.0
10 Firm2 Country2 Curr3 15.0
11 Firm2 Country2 Curr1 90.0
现在,不再是每次计算Curr3和Curr2之间的差值并替换Curr3的值
使用
DataFrame.where
、Series.isin
、GroupBy.diff
和Series.fillna
:首先我们用{}将所有{}转换为{},然后对{}和{}进行分组,并计算{}中的差异
相关问题 更多 >
编程相关推荐