Grouping values by column using a condition

import pandas as pd

df = pd.DataFrame({
    'ids': {0: 'D1791272223', 1: 'V25369085223', 2: 'V25117230523', 3: 'V25104327323', 4: 'V24862169823', 5: 'P3944221523', 6: 'V24776335823', 7: 'V24722584123', 8: 'V24716191923', 9: 'V24575876123', 10: 'V24791923'},
    'status': {0: 'Resolved', 1: 'Resolved', 2: 'Resolved', 3: 'Resolved', 4: 'Open', 5: 'Open', 6: 'Closed', 7: 'Resolved', 8: 'Resolved', 9: 'Open', 10: 'Resolved'},
    'action': {0: 'Comment', 1: 'Implementation', 2: 'Comment', 3: 'Implementation', 4: 'Comment', 5: 'Implementation', 6: 'Comment', 7: 'Comment', 8: 'Implementation', 9: 'Comment', 10: 'Implementation'},
    'sev': {0: 3, 1: 2, 2: 1, 3: 3, 4: 4, 5: 4, 6: 3, 7: 2, 8: 2, 9: 1, 10: 5},
})

| ids          | status   | action         | sev |
|--------------|----------|----------------|-----|
| D1791272223  | Resolved | Comment        | 3   |
| V25369085223 | Resolved | Implementation | 2   |
| V25117230523 | Resolved | Comment        | 1   |
| V25104327323 | Resolved | Implementation | 3   |
| V24862169823 | Open     | Comment        | 4   |
| P3944221523  | Open     | Implementation | 4   |
| V24776335823 | Closed   | Comment        | 3   |
| V24722584123 | Resolved | Comment        | 2   |
| V24716191923 | Resolved | Implementation | 2   |
| V24575876123 | Open     | Comment        | 1   |
| V24791923    | Resolved | Implementation | 5   |

| sev | count | Open count | Closed and Resolved count |
|-----|-------|------------|---------------------------|
| 1&2 | 5     | 1          | 4                         |
| 3   | 3     | 0          | 3                         |
| 4&5 | 3     | 2          | 1                         |

1 answer

User

#1 · Posted on 2024-09-30 06:14:41

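The main issue is that you need to aggregate the severity levels into fewer categories. This can be done with pd.cut, because sev is numeric and you want to bin it over contiguous intervals. If it were not numeric, or the intervals were not contiguous (e.g. 1&4, 2, 3&5), you would need df.replace with a mapping dictionary instead.

After that, you can reshape the result with df.pivot_table, or "manually" with groupby/unstack. I prefer groupby because it is more flexible in other situations.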

df['sev_group'] = pd.cut(df['sev'], bins=[0, 2, 3, 5],
    labels=['1&2', '3', '4&5'])


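# Count rows per (sev_group, status) pair, then pivot status into columns.
# observed=False keeps every severity bin; fill_value=0 turns missing pairs into 0.
summary = df.groupby(['sev_group', 'status'], observed=False).size().unstack(fill_value=0)
# or
# summary = df.pivot_table(values='ids', index='sev_group',
#     columns='status', aggfunc='count', fill_value=0)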


summary['count'] = summary.sum(axis=1)
summary['Closed/Resolved'] = summary['Closed'] + summary['Resolved']
summary = summary[['count', 'Open', 'Closed/Resolved']]

Output

status     count  Open  Closed/Resolved
sev_group
1&2            5     1                4
3              3     0                3
4&5            3     2                1
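
For the non-contiguous case mentioned above (e.g. 1&4, 2, 3&5), here is a minimal sketch of the mapping-dictionary approach. It uses Series.map rather than df.replace, which is a substitution on my part; the grouping itself and the name sev_to_group are made up for illustration, and only the sev and status columns from the question are reproduced.

```python
import pandas as pd

# Only the two columns needed for the summary, copied from the question's data.
df = pd.DataFrame({
    'sev': [3, 2, 1, 3, 4, 4, 3, 2, 2, 1, 5],
    'status': ['Resolved', 'Resolved', 'Resolved', 'Resolved', 'Open', 'Open',
               'Closed', 'Resolved', 'Resolved', 'Open', 'Resolved'],
})

# Hypothetical non-contiguous grouping: pd.cut cannot express this,
# so each severity is mapped to its group label explicitly.
sev_to_group = {1: '1&4', 4: '1&4', 2: '2', 3: '3&5', 5: '3&5'}
df['sev_group'] = df['sev'].map(sev_to_group)

# Same reshaping as in the answer: count per (group, status), status as columns.
summary = df.groupby(['sev_group', 'status']).size().unstack(fill_value=0)
summary['count'] = summary.sum(axis=1)
summary['Closed/Resolved'] = summary['Closed'] + summary['Resolved']
summary = summary[['count', 'Open', 'Closed/Resolved']]
print(summary)
```

Any severity missing from the dictionary would become NaN under map and be dropped by groupby, so the mapping should cover every value that occurs in the column.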
