groupby中特定行的差异

2024-09-26 18:06:18 发布

您现在位置:Python中文网/ 问答频道 /正文

我有一个熊猫数据框

df = pd.DataFrame({'Firm': ['Firm1','Firm1','Firm1','Firm1','Firm1','Firm1','Firm2','Firm2','Firm2','Firm2','Firm2','Firm2'],'Location' : ['Country1', 'Country1', 'Country1', 'Country2', 'Country2', 'Country2','Country1', 'Country1', 'Country1', 'Country2', 'Country2', 'Country2'], 'Currency' : ['Curr1', 'Curr2', 'Curr3', 'Curr1', 'Curr2', 'Curr3','Curr1', 'Curr2', 'Curr3', 'Curr1', 'Curr2', 'Curr3'], 'Value' : [100, 105, 110, 100, 95, 120, 95, 110, 115, 105, 120, 90] })

看起来是这样的:

df:

     Firm  Location Currency  Value
0   Firm1  Country1    Curr1    100
1   Firm1  Country1    Curr2    105
2   Firm1  Country1    Curr3    110
3   Firm1  Country2    Curr1    100
4   Firm1  Country2    Curr2     95
5   Firm1  Country2    Curr3    120
6   Firm2  Country1    Curr1     95
7   Firm2  Country1    Curr2    110
8   Firm2  Country1    Curr3    115
9   Firm2  Country2    Curr1    105
10  Firm2  Country2    Curr2    120
11  Firm2  Country2    Curr3     90

现在,我想计算每个公司位置组的Curr3和Curr2(列值)之间的差异,并根据结果更改Curr3的值。结果df应如下所示:

     Firm  Location Currency  Value
0   Firm1  Country1    Curr1    100
1   Firm1  Country1    Curr2    105
2   Firm1  Country1    Curr3      5
3   Firm1  Country2    Curr1    100
4   Firm1  Country2    Curr2     95
5   Firm1  Country2    Curr3     25
6   Firm2  Country1    Curr1     95
7   Firm2  Country1    Curr2    110
8   Firm2  Country1    Curr3      5
9   Firm2  Country2    Curr1    105
10  Firm2  Country2    Curr2    120
11  Firm2  Country2    Curr3    -30

我已经尝试使用.groupby.apply来给出结果,但是我想在原始数据帧中进行转换

df2 = df.groupby(['Firm','Location']).apply(lambda g: g[g.Currency == 'Curr3'].Value.values[0] - g[g.Currency == 'Curr2'].Value.values[0])

df2:

Firm    Location    0
Firm1   Country1    5
Firm1   Country2    25
Firm2   Country1    5
Firm2   Country2    -30

我不知道如何在原始df中执行此操作。我还使用.transform尝试了相同的方法,但是它会产生一个错误:

df2 = df.groupby(['Firm','Location']).transform(lambda g: g[g.Currency == 'Curr3'].Value.values[0] - g[g.Currency == 'Curr2'].Value.values[0])

AttributeError: ("'Series' object has no attribute 'Currency'", 'occurred at index Currency')

----基于Erfan解决方案的更新:

newvals = (
    df.where(df['Currency'].isin(['Curr2', 'Curr3']))
      .groupby(['Firm', 'Location'])['Value'].diff()
)
df['Value'] = newvals.fillna(df['Value'])

但是,如果df看起来像这样(货币未排序),则解决方案不再有效(因为diff()只计算与上一个值的差值)

    Firm    Location    Currency    Value
0   Firm1   Country1    Curr2   100
1   Firm1   Country1    Curr1   105
2   Firm1   Country1    Curr3   110
3   Firm1   Country2    Curr3   100
4   Firm1   Country2    Curr2   95
5   Firm1   Country2    Curr1   120
6   Firm2   Country1    Curr1   95
7   Firm2   Country1    Curr2   110
8   Firm2   Country1    Curr3   115
9   Firm2   Country2    Curr2   105
10  Firm2   Country2    Curr3   120
11  Firm2   Country2    Curr1   90

->;结果:

    Firm    Location    Currency    Value
0   Firm1   Country1    Curr2   100.0
1   Firm1   Country1    Curr1   105.0
2   Firm1   Country1    Curr3   10.0
3   Firm1   Country2    Curr3   100.0
4   Firm1   Country2    Curr2   -5.0
5   Firm1   Country2    Curr1   120.0
6   Firm2   Country1    Curr1   95.0
7   Firm2   Country1    Curr2   110.0
8   Firm2   Country1    Curr3   5.0
9   Firm2   Country2    Curr2   105.0
10  Firm2   Country2    Curr3   15.0
11  Firm2   Country2    Curr1   90.0

现在,不再是每次计算Curr3和Curr2之间的差值并替换Curr3的值


Tags: dfvaluelocationcurrencyvaluesdf2groupbyfirm
1条回答
网友
1楼 · 发布于 2024-09-26 18:06:18

使用DataFrame.whereSeries.isinGroupBy.diffSeries.fillna

首先我们用{}将所有{}转换为{},然后对{}和{}进行分组,并计算{}中的差异

newvals = (
    df.where(df['Currency'].isin(['Curr2', 'Curr3']))
      .groupby(['Firm', 'Location'])['Value'].diff()
)
df['Value'] = newvals.fillna(df['Value'])
     Firm  Location Currency  Value
0   Firm1  Country1    Curr1  100.0
1   Firm1  Country1    Curr2  105.0
2   Firm1  Country1    Curr3    5.0
3   Firm1  Country2    Curr1  100.0
4   Firm1  Country2    Curr2   95.0
5   Firm1  Country2    Curr3   25.0
6   Firm2  Country1    Curr1   95.0
7   Firm2  Country1    Curr2  110.0
8   Firm2  Country1    Curr3    5.0
9   Firm2  Country2    Curr1  105.0
10  Firm2  Country2    Curr2  120.0
11  Firm2  Country2    Curr3  -30.0

相关问题 更多 >

    热门问题