Pandas grouping that keeps all tuples (rows)

Posted 2024-09-29 07:29:32


I have a DataFrame that looks like this (it actually has 35 columns and many more tuples, but these are the relevant columns):

     leg_side  leg_quantity expiration product  change_type  
0        None          None       None      ZQ     inserted  
1        None          None       None      HG     inserted  
2        None          None       None      PL     inserted  
3        None          None       None      SI     inserted  
4        None          None       None      ZQ     inserted  
5        None          None       None      PL     inserted  
6        None          None       None      ZW     inserted  
7        None          None       None      SI     inserted  
8        None          None       None      ZQ     updated  
9        None          None       None      SI     inserted  
10       None          None       None      ZC     updated
..        ...           ...        ...     ...          ...  
970      None          None       None      OZ     inserted  
971      None          None       None      OZ     deleted  
972      None          None       None      OZ     updated  
973      None          None       None      ZC     inserted  
974      None          None       None      OZ     inserted  
975      None          None       None      ZC     inserted  
976      None          None       None      OZ     inserted

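What I want to do now is group by product, though not necessarily in the SQL sense. I want to gather all tuples with the same product together and then sub-group them by change_type, giving a df like this: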

     leg_side  leg_quantity expiration product  change_type  
0        None          None       None      ZQ     inserted
4        None          None       None      ZQ     inserted
8        None          None       None      ZQ     updated 
1        None          None       None      HG     inserted
2        None          None       None      PL     inserted
5        None          None       None      PL     inserted
3        None          None       None      SI     inserted
7        None          None       None      SI     inserted
9        None          None       None      SI     inserted
6        None          None       None      ZW     inserted
...
973      None          None       None      ZC     inserted
975      None          None       None      ZC     inserted
10       None          None       None      ZC     updated
970      None          None       None      OZ     inserted
974      None          None       None      OZ     inserted
976      None          None       None      OZ     inserted
972      None          None       None      OZ     updated
971      None          None       None      OZ     deleted

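The DataFrame above is organized so that all tuples with the same product name sit together, and within each product all tuples with the same change_type sit together (preferably in the order inserted, updated, deleted). If I use groupby(), the tuples get eliminated. I only want the rows arranged as if grouped. How can I do this?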


1 answer
User
#1 · Posted 2024-09-29 07:29:32

You can use a Categorical to set a custom order, then sort the data with sort_values:

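# Make change_type an ordered categorical: inserted < updated < deleted
df['change_type'] = (df['change_type'].astype('category')
                                      .cat
                                      .set_categories(["inserted", "updated", "deleted"], ordered=True))

# Sort the rows of each product group by that custom order
df = (df.groupby('product')
        .apply(lambda x: x.sort_values('change_type'))
        .reset_index(drop=True))
print(df)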

   leg_side leg_quantity expiration product change_type
0      None         None       None      HG    inserted
1      None         None       None      OZ    inserted
2      None         None       None      OZ    inserted
3      None         None       None      OZ    inserted
4      None         None       None      OZ     updated
5      None         None       None      OZ     deleted
6      None         None       None      PL    inserted
7      None         None       None      PL    inserted
8      None         None       None      SI    inserted
9      None         None       None      SI    inserted
10     None         None       None      SI    inserted
11     None         None       None      ZC    inserted
12     None         None       None      ZC    inserted
13     None         None       None      ZC     updated
14     None         None       None      ZQ    inserted
15     None         None       None      ZQ    inserted
16     None         None       None      ZQ     updated
17     None         None       None      ZW    inserted
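
Note that the output above lists products alphabetically and renumbers the rows, whereas the question shows products in order of first appearance with the original row labels kept. A minimal sketch of one way to get that layout, assuming it is acceptable to turn product into an ordered categorical as well (the small sample frame below is hypothetical and only mirrors the two relevant columns):

```python
import pandas as pd

# Hypothetical sample mirroring the question's product / change_type columns.
df = pd.DataFrame({
    "product": ["ZQ", "HG", "PL", "SI", "ZQ", "PL", "ZW",
                "SI", "ZQ", "SI", "OZ", "OZ", "OZ"],
    "change_type": ["inserted", "inserted", "inserted", "inserted",
                    "inserted", "inserted", "inserted", "inserted",
                    "updated", "inserted", "deleted", "updated", "inserted"],
})

# Custom order for change_type: inserted < updated < deleted.
df["change_type"] = pd.Categorical(
    df["change_type"],
    categories=["inserted", "updated", "deleted"],
    ordered=True,
)

# Order products by first appearance rather than alphabetically.
df["product"] = pd.Categorical(
    df["product"],
    categories=df["product"].unique(),
    ordered=True,
)

# Sorting on both columns keeps every row and its original index label.
result = df.sort_values(["product", "change_type"])
print(result)
```

Because nothing is aggregated, no rows are dropped; append .reset_index(drop=True) if a fresh 0..n-1 index is preferred.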
