Group by with a count that depends on a condition

Posted 2024-09-24 06:34:07


Suppose I have a df that looks like this:


    import pandas as pd

    df = pd.DataFrame({'Data1' : ['A', 'A', 'A', 'B', 'B', 'B'],
                       'Data2' : [100, 100, 200, 100, 100, 100],
                       'Data3' : [1, 2, 3, 1, 1, 1],
                       'State' : ['On', 'On', 'Off', 'Off', 'On', 'On']})
+-------+-------+-------+-------+
| Data1 | Data2 | Data3 | State |
+-------+-------+-------+-------+
| A     |   100 |     1 | On    |
| A     |   100 |     2 | On    |
| A     |   200 |     3 | Off   |
| B     |   100 |     1 | Off   |
| B     |   100 |     1 | On    |
| B     |   100 |     1 | On    |
+-------+-------+-------+-------+

I want to group by Data1 and Data2 and then take a nunique count of Data3, but counting only the rows whose State value is 'On'.

So my result would look like this:

+-------+-------+-------+-------+-------+
| Data1 | Data2 | Data3 | State | Count |
+-------+-------+-------+-------+-------+
| A     |   100 |     1 | On    |     2 |
| A     |   100 |     2 | On    |     2 |
| A     |   200 |     3 | Off   |     0 |
| B     |   100 |     1 | Off   |     1 |
| B     |   100 |     1 | On    |     1 |
| B     |   100 |     1 | On    |     1 |
+-------+-------+-------+-------+-------+

I know the following is wrong because it also groups by State, but I don't know how to group only by Data1 and Data2 while counting only the rows where State == 'On':

df['Count'] = df.groupby(['Data1', 'Data2', 'State'])['Data3'].transform('nunique')
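
To make the problem concrete, here is a minimal sketch of what that attempt produces on the sample data. The variable name `attempt` is only for illustration. Because State is one of the grouping keys, the (A, 200, Off) row forms its own group and gets a count of 1 instead of the desired 0.

```python
import pandas as pd

df = pd.DataFrame({'Data1': ['A', 'A', 'A', 'B', 'B', 'B'],
                   'Data2': [100, 100, 200, 100, 100, 100],
                   'Data3': [1, 2, 3, 1, 1, 1],
                   'State': ['On', 'On', 'Off', 'Off', 'On', 'On']})

# Grouping by State as well splits each (Data1, Data2) pair into On/Off groups,
# so 'Off' rows are counted within their own group rather than excluded.
attempt = df.groupby(['Data1', 'Data2', 'State'])['Data3'].transform('nunique')

print(attempt.tolist())  # [2, 2, 1, 1, 1, 1], but the desired result is [2, 2, 0, 1, 1, 1]
```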

Thanks for any help.


Tags: dataframe, df, on, count, error, pd, state

2 Answers

You can also apply a boolean mask, then use groupby.nunique and a left merge:

cols = ['Data1','Data2']
m = df[df['State'].eq("On")].groupby(cols)['Data3'].nunique()
out = (df.merge(m,left_on=cols,right_index=True,how='left',suffixes=('','_counts'))
       .fillna({"Data3_counts":0}))

print(out)

  Data1  Data2  Data3 State  Data3_counts
0     A    100      1    On           2.0
1     A    100      2    On           2.0
2     A    200      3   Off           0.0
3     B    100      1   Off           1.0
4     B    100      1    On           1.0
5     B    100      1    On           1.0
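
The counts above come back as floats because the left merge leaves NaN for groups that have no 'On' rows before `fillna` runs. If integer counts in a column named `Count` (as in the question) are preferred, one possible variation is sketched below. Renaming the Series before merging and the final `astype(int)` are additions for illustration, not part of the answer above.

```python
import pandas as pd

df = pd.DataFrame({'Data1': ['A', 'A', 'A', 'B', 'B', 'B'],
                   'Data2': [100, 100, 200, 100, 100, 100],
                   'Data3': [1, 2, 3, 1, 1, 1],
                   'State': ['On', 'On', 'Off', 'Off', 'On', 'On']})

cols = ['Data1', 'Data2']

# Distinct Data3 values per (Data1, Data2), computed over the 'On' rows only.
# Renaming to 'Count' avoids a name clash with the existing Data3 column.
m = df[df['State'].eq('On')].groupby(cols)['Data3'].nunique().rename('Count')

# A left merge keeps every original row; groups with no 'On' rows get NaN.
out = df.merge(m, left_on=cols, right_index=True, how='left')

# Replace the NaN with 0 and restore an integer dtype.
out['Count'] = out['Count'].fillna(0).astype(int)

print(out)
```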

Let's try reindex:

df['Count'] = df[df['State'].eq('On')].groupby(['Data1','Data2'])['Data3'].nunique().reindex(pd.MultiIndex.from_frame(df[['Data1','Data2']]), fill_value=0).values
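
As a further sketch (not taken from either answer above), the same result can probably be reached without a merge or reindex by blanking out Data3 on the 'Off' rows and then transforming. This relies on `nunique` ignoring NaN by default; the name `masked` is only for illustration.

```python
import pandas as pd

df = pd.DataFrame({'Data1': ['A', 'A', 'A', 'B', 'B', 'B'],
                   'Data2': [100, 100, 200, 100, 100, 100],
                   'Data3': [1, 2, 3, 1, 1, 1],
                   'State': ['On', 'On', 'Off', 'Off', 'On', 'On']})

# Keep Data3 only where State is 'On'; every other row becomes NaN.
masked = df['Data3'].where(df['State'].eq('On'))

# Group the masked values by Data1 and Data2 only. nunique skips NaN,
# so 'Off' rows contribute nothing, and transform broadcasts the
# per-group count back onto every original row.
df['Count'] = masked.groupby([df['Data1'], df['Data2']]).transform('nunique')

print(df)
```

Since State is never a grouping key here, 'Off' rows still receive the count of their (Data1, Data2) group, which matches the expected output in the question.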
