Add a new column to a DataFrame based on a group-by over multiple conditions

Posted 2024-10-03 00:19:54

I have a DataFrame like the following:

       Color  Month  Quantity
index                        
0          1      1     34047
1          1      2     36654
2          2      3     37291
3          2      4     35270
4          3      5     35407
5          1     12      9300
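
To try the answers below, the frame above can be rebuilt as follows. This is a minimal sketch: the column names and values are copied from the table, and naming the index "index" is an assumption based on how the table is printed.

```python
import pandas as pd

# Sample data copied from the table in the question
df = pd.DataFrame(
    {
        "Color": [1, 1, 2, 2, 3, 1],
        "Month": [1, 2, 3, 4, 5, 12],
        "Quantity": [34047, 36654, 37291, 35270, 35407, 9300],
    }
)
df.index.name = "index"  # assumption: matches the "index" label in the printout

print(df)
```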

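I want to add an extra column, PreviousMonthQty, to this DataFrame. Each row's value should be the Quantity of the row with the same Color whose Month is the previous month, i.e. a lookup on (Color, Month) where Month is the previous month.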

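The target DataFrame I expect looks like this:

[image: expected output DataFrame]

The logic can be explained as follows:

[image: explanation of the logic]

Any help would be greatly appreciated.

Many thanks.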


3 Answers

Here is one way using pd.MultiIndex.from_arrays with .map after finding the previous month:

import pandas as pd

# unit='m' means minutes: one minute before the first of a month falls in the previous month
prev_month = pd.to_datetime(df['Month'], format='%m').sub(pd.Timedelta(1, unit='m')).dt.month

# Quantity indexed by (Color, Month), used as the lookup table
m = df.set_index(['Color', 'Month'])['Quantity']

final = df.assign(Prev_Month_Value=pd.MultiIndex.from_arrays([df['Color'], prev_month])
                                                .map(m).fillna(0))

# To assign into the existing df, use the code below instead of df.assign(), which returns a copy
# df['Previous Month Value'] = (pd.MultiIndex.from_arrays([df['Color'], prev_month])
#                                 .map(m).fillna(0))

Output:

       Color  Month  Quantity  Prev_Month_Value
index                                          
0          1      1     34047            9300.0
1          1      2     36654           34047.0
2          2      3     37291               0.0
3          2      4     35270           37291.0
4          3      5     35407               0.0
5          1     12      9300               0.0

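Details:

Step 1: Find the previous month by converting the Month column to datetime and subtracting pd.Timedelta(1, unit='m'). Note that unit='m' is one minute, not one month; it works because every parsed timestamp is midnight on the first of its month, so stepping back one minute lands in the previous month (and January wraps to December).

Step 2: Create a MultiIndex Series with Quantity as the values and (Color, Month) as the index.

Step 3: Create a MultiIndex from Color and the prev_month Series, map it through the Series from Step 2, and assign the result as the new column (filling NaN with 0).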
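
As a variation on the steps above, the previous month can also be computed with integer arithmetic instead of a datetime round-trip. This is a sketch rather than the answerer's code: the modulo expression and the use of Series.reindex in place of MultiIndex.map are swapped in here, and it assumes each (Color, Month) pair occurs at most once.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame(
    {
        "Color": [1, 1, 2, 2, 3, 1],
        "Month": [1, 2, 3, 4, 5, 12],
        "Quantity": [34047, 36654, 37291, 35270, 35407, 9300],
    }
)

# Previous month with wrap-around: 1 -> 12, 2 -> 1, ..., 12 -> 11
prev_month = (df["Month"] - 2) % 12 + 1

# Lookup table: Quantity indexed by (Color, Month); the pairs must be unique
lookup = df.set_index(["Color", "Month"])["Quantity"]

# One (Color, previous month) key per row, in row order
keys = pd.MultiIndex.from_arrays([df["Color"], prev_month])

# Missing keys come back as NaN and are replaced by 0; to_numpy() assigns by position
df["Prev_Month_Value"] = lookup.reindex(keys).fillna(0).to_numpy()

print(df)
```

The modulo form avoids parsing dates at all, which may be easier to read when Month is already an integer from 1 to 12.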

Use DataFrame.pivot to reshape the DataFrame and DataFrame.reindex to add the full range of months:

# 'Oty' is the quantity column's name in this answer's copy of the data
df1 = df.pivot(index='Color', columns='Month', values='Oty').reindex(columns=range(1,13))
print (df1)
Month        1        2        3        4        5   6   7   8   9  10  11  \
Color                                                                        
1      34047.0  36654.0      NaN      NaN      NaN NaN NaN NaN NaN NaN NaN   
2          NaN      NaN  37291.0  35270.0      NaN NaN NaN NaN NaN NaN NaN   
3          NaN      NaN      NaN      NaN  35407.0 NaN NaN NaN NaN NaN NaN   

Month      12  
Color          
1      9300.0  
2         NaN  
3         NaN  

Then use numpy.roll together with DataFrame.join:

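import numpy as np

# Shift every row one column to the right, so each month holds the previous month's
# value and December wraps around into January
s = pd.DataFrame(np.roll(df1.to_numpy(), 1, axis=1), 
                 index=df1.index, 
                 columns=df1.columns).stack().rename('Previous Month')

# Look up each (Color, Month) pair; pairs with no previous-month value become 0
df = df.join(s, on=['Color','Month']).fillna({'Previous Month':0})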
print (df)
   Index  Color  Month    Oty  Previous Month
0      0      1      1  34047          9300.0
1      1      1      2  36654         34047.0
2      2      2      3  37291             0.0
3      3      2      4  35270         37291.0
4      4      3      5  35407             0.0
5      5      1     12   9300             0.0
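
The wrap-around that makes December feed January comes from numpy.roll itself. The toy row below is made up for illustration (it is not data from the question) and shows the shift in isolation:

```python
import numpy as np

# One row with four "months"; the third has no value
row = np.array([[10.0, 20.0, np.nan, 40.0]])

# Shift one position to the right along the column axis;
# the last element wraps around to the front
rolled = np.roll(row, 1, axis=1)

print(rolled)  # [[40. 10. 20. nan]]
```

After the shift, each column holds what used to be in the column to its left, which is why the pivoted table is first reindexed to all twelve months.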

Here is another approach using DataFrame.merge: we merge on a prv_month key, which we create inline with DataFrame.assign:

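# Month - 1, with 0 (the month before January) replaced by 12.
# The left merge pairs each row with the same Color's previous month;
# the right-hand copy of Qty comes back as 'Qty_y'.
# The assignment relies on df having a default 0..n-1 index.
df['PreviousQty'] = (df.assign(prv_month=df['Month'].sub(1).where(lambda x: x!=0, 12))
                     .merge(df,
                            how='left',
                            left_on=['Color', 'prv_month'],
                            right_on=['Color', 'Month'])['Qty_y'].fillna(0))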

Output:

   Color  Month    Qty  PreviousQty
0      1      1  34047       9300.0
1      1      2  36654      34047.0
2      2      3  37291          0.0
3      2      4  35270      37291.0
4      3      5  35407          0.0
5      1     12   9300          0.0
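
A self-contained variant of the merge approach, using the Quantity column name from the question. This is a sketch under assumptions: renaming the right-hand frame's columns before merging (so no _x/_y suffixes appear) is a choice made here, not part of the answer, and the names prev and out are hypothetical.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame(
    {
        "Color": [1, 1, 2, 2, 3, 1],
        "Month": [1, 2, 3, 4, 5, 12],
        "Quantity": [34047, 36654, 37291, 35270, 35407, 9300],
    }
)

# Right-hand lookup frame: its Month becomes the join key, its Quantity the result
prev = df.rename(columns={"Month": "prv_month", "Quantity": "PreviousQty"})

out = (
    df.assign(prv_month=df["Month"].sub(1).where(lambda x: x != 0, 12))
      .merge(prev, how="left", on=["Color", "prv_month"])
      .drop(columns="prv_month")
)

# Rows with no matching previous month get 0
out["PreviousQty"] = out["PreviousQty"].fillna(0)

print(out)
```

Because the result is built as a new frame rather than assigned back by index, it does not depend on df having a default integer index.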
