How to drop an entire group in pandas when a condition is met

Posted 2024-07-03 02:25:29

I am trying to drop an entire group of rows when a specific condition is met.

import pandas as pd


raw_data = {'regiment': ['51st', '51st', '51st', '51st', '51st', '51st', '51st', '51st', '51st', '51st', '51st', '51st'],
            'trucks': ['MAZ-7310', 'MAZ-7310', 'MAZ-7310', 'MAZ-7310', 'Tatra 810', 'Tatra 810', 'Tatra 810', 'Tatra 810', 'ZIS-150', 'ZIS-150', 'ZIS-150', 'ZIS-150'],
            'drivers': ['MAZ', 'MAZ', 'IVE', 'IVE', 'MAN', 'MAN', 'MERC', 'TATA', 'TATA', 'MAN', 'REN', 'TATA'],
            'counts': [0, 0, 1, 1, 0, 0, 1, 0, 1, 2, 3, 4]}

df = pd.DataFrame(raw_data, columns=['regiment', 'trucks', 'drivers', 'counts'])

   regiment     trucks drivers  counts
0      51st   MAZ-7310     MAZ       0
1      51st   MAZ-7310     MAZ       0
2      51st   MAZ-7310     IVE       1
3      51st   MAZ-7310     IVE       1
4      51st  Tatra 810     MAN       0
5      51st  Tatra 810     MAN       0
6      51st  Tatra 810    MERC       1
7      51st  Tatra 810    TATA       0
8      51st    ZIS-150    TATA       1
9      51st    ZIS-150     MAN       2
10     51st    ZIS-150     REN       3
11     51st    ZIS-150    TATA       4

I am trying to drop the whole MAZ-7310 group, because it contains rows where drivers is MAZ and counts == 0.

So I followed this post, "Pandas groupby and filter":

df = df.groupby(['regiment','trucks']).filter(lambda x: ~((x['counts'] == 0) & (x['drivers'] == 'MAZ')).all())

But it does not seem to give me the output I need.

Expected output

    regiment     trucks drivers  counts
4      51st  Tatra 810     MAN       0
5      51st  Tatra 810     MAN       0
6      51st  Tatra 810    MERC       1
7      51st  Tatra 810    TATA       0
8      51st    ZIS-150    TATA       1
9      51st    ZIS-150     MAN       2
10     51st    ZIS-150     REN       3
11     51st    ZIS-150    TATA       4

How can I get this output?

Thanks.


2 answers

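First, we assign a new column named m, a boolean flag that is True for the rows where drivers is MAZ and counts is 0.

Then we use GroupBy with transform('any') to mark every row of each group in which any m is True.

Finally, we use boolean indexing with ~ to keep the opposite: the rows whose group contains no such row.

Methods used: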

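# Compare counts with 0 explicitly: ~ on an integer column is a bitwise NOT,
# not a test for zero, so ~df['counts'] is only reliable for values 0 and 1.
mask = (df.assign(m=(df['drivers'].eq('MAZ') & df['counts'].eq(0)))
          .groupby(['regiment','trucks'])['m'].transform('any')
       )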

df[~mask]

   regiment     trucks drivers  counts
4      51st  Tatra 810     MAN       0
5      51st  Tatra 810     MAN       0
6      51st  Tatra 810    MERC       1
7      51st  Tatra 810    TATA       0
8      51st    ZIS-150    TATA       1
9      51st    ZIS-150     MAN       2
10     51st    ZIS-150     REN       3
11     51st    ZIS-150    TATA       4
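
To make the three steps easier to follow, here is a minimal sketch that builds each intermediate result separately instead of chaining them. The variable names row_flag, group_flag and result are only illustrative and do not appear in the answer above; the data is the frame from the question. It compares counts with 0 via eq(0), since ~ applied to integers flips bits rather than testing for zero.

```python
import pandas as pd

# Same data as in the question.
df = pd.DataFrame({
    'regiment': ['51st'] * 12,
    'trucks': ['MAZ-7310'] * 4 + ['Tatra 810'] * 4 + ['ZIS-150'] * 4,
    'drivers': ['MAZ', 'MAZ', 'IVE', 'IVE', 'MAN', 'MAN', 'MERC', 'TATA',
                'TATA', 'MAN', 'REN', 'TATA'],
    'counts': [0, 0, 1, 1, 0, 0, 1, 0, 1, 2, 3, 4],
})

# Step 1: row-level flag, True where drivers is MAZ and counts is 0.
row_flag = df['drivers'].eq('MAZ') & df['counts'].eq(0)

# Step 2: spread "at least one flagged row in this group" to every row
# of the group, so the result lines up with df row by row.
group_flag = row_flag.groupby([df['regiment'], df['trucks']]).transform('any')

# Step 3: keep only rows whose group has no flagged row.
result = df[~group_flag]
print(result)

# Why eq(0) instead of ~: on integers, ~ is a bitwise NOT.
print((~pd.Series([0, 1, 2])).tolist())  # [-1, -2, -3]
```

Because transform returns a Series with the same index as df, the mask can be reused for other selections, for example df[group_flag] to inspect the rows that were dropped.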

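Based on your desired output, you need any instead of all. With all, a group is dropped only when every row in it matches the condition; MAZ-7310 also contains IVE rows with counts 1, so no group is dropped. So just change all to any in your code: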

df_final = df.groupby(['regiment','trucks']).filter(lambda x: ~((x['counts'] ==0) 
                                                    & (x['drivers'] == 'MAZ')).any())

Out[234]:
   regiment     trucks drivers  counts
4      51st  Tatra 810     MAN       0
5      51st  Tatra 810     MAN       0
6      51st  Tatra 810    MERC       1
7      51st  Tatra 810    TATA       0
8      51st    ZIS-150    TATA       1
9      51st    ZIS-150     MAN       2
10     51st    ZIS-150     REN       3
11     51st    ZIS-150    TATA       4
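
The difference between the two reductions can be checked per group. The following is a rough sketch on the question's data; the helper name matches and the variables every_row, some_row, kept_with_all and kept_with_any are hypothetical, introduced only for this illustration. It negates the scalar with not rather than ~, which is a stylistic choice here and not part of the answer above.

```python
import pandas as pd

# Same data as in the question.
df = pd.DataFrame({
    'regiment': ['51st'] * 12,
    'trucks': ['MAZ-7310'] * 4 + ['Tatra 810'] * 4 + ['ZIS-150'] * 4,
    'drivers': ['MAZ', 'MAZ', 'IVE', 'IVE', 'MAN', 'MAN', 'MERC', 'TATA',
                'TATA', 'MAN', 'REN', 'TATA'],
    'counts': [0, 0, 1, 1, 0, 0, 1, 0, 1, 2, 3, 4],
})

def matches(frame):
    # Hypothetical helper: the row-level condition from the question.
    return (frame['counts'] == 0) & (frame['drivers'] == 'MAZ')

keys = [df['regiment'], df['trucks']]

# Per group: does every row match, and does at least one row match?
every_row = matches(df).groupby(keys).all()
some_row = matches(df).groupby(keys).any()
print(pd.DataFrame({'all': every_row, 'any': some_row}))

# filter keeps the groups for which the function returns True.
grouped = df.groupby(['regiment', 'trucks'])
kept_with_all = grouped.filter(lambda g: not matches(g).all())
kept_with_any = grouped.filter(lambda g: not matches(g).any())

print(len(kept_with_all))  # 12: no group has all rows matching
print(len(kept_with_any))  # 8: MAZ-7310 is dropped
```

Only MAZ-7310 has a matching row, and even there just two of its four rows match, which is why the all version leaves the frame unchanged.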
