如何在每天之后汇总总数

2024-06-28 19:50:55 发布

您现在位置:Python中文网/ 问答频道 /正文

我有一张像这样的桌子

      Date        Hospital Total Operations Errors
2001-10-01      Hospital A             1101      0
2001-10-01      Hospital B               32      0
2001-10-02      Hospital A             1184      0
2001-10-02      Hospital B               74      0
2001-10-03      Hospital A             1350      0
2001-10-03      Hospital B               72      0
2001-10-04      Hospital A             1364      0
2001-10-04      Hospital B              232      0
2001-10-05      Hospital A             1345      0
2001-10-05      Hospital B              212      0

我想计算每家医院每天的总数,并将它们添加到特定日期的行中,这样我的最终表格可以如下所示

      Date        Hospital Total Operations Errors Aggregated Operations Aggregated Errors
2001-10-01      Hospital A             1101      0                  1101                 0
2001-10-01      Hospital B               32      0                    32                 0
2001-10-02      Hospital A             1184      0                  2285                 0
2001-10-02      Hospital B               74      0                   106                 0
2001-10-03      Hospital A             1350      0                  3635                 0
2001-10-03      Hospital B               72      0                   178                 0
2001-10-04      Hospital A             1364      0                  4999                 0
2001-10-04      Hospital B              232      0                   410                 0
2001-10-05      Hospital A             1345      0                  6344                 0
2001-10-05      Hospital B              212      0                   622                 0

在python中有这样做的方法吗?我们衷心感谢您的帮助


Tags: 方法date表格totaloperations桌子总数errors
3条回答
df.join(df.sort_values(by=['Date','Hospital']).groupby(['Hospital']).cumsum(), rsuffix=' Aggregated')

输出

         Date    Hospital  Total Operations  Errors  Total Operations Aggregated  Errors Aggregated
0  2001-10-01  Hospital A              1101       0                         1101                  0
1  2001-10-01  Hospital B                32       0                           32                  0
2  2001-10-02  Hospital A              1184       0                         2285                  0
3  2001-10-02  Hospital B                74       0                          106                  0
4  2001-10-03  Hospital A              1350       0                         3635                  0
5  2001-10-03  Hospital B                72       0                          178                  0
6  2001-10-04  Hospital A              1364       0                         4999                  0
7  2001-10-04  Hospital B               232       0                          410                  0
8  2001-10-05  Hospital A              1345       0                         6344                  0
9  2001-10-05  Hospital B               212       0                          622                  0

您可以使用groupby()cumsum(),正如我之前在评论中建议的那样:

df = pd.DataFrame({'Date':['2001-10-01','2001-10-01','2001-10-02','2001-10-02','2001-10-03','2001-10-03','2001-10-04','2001-10-04'],
                   'Hospital':['Hospital A','Hospital B','Hospital A','Hospital B','Hospital A','Hospital B','Hospital A','Hospital B'],
                   'Total Operations':[1101,32,1184,74,1350,72,1364,232],
                   'Errors':[0,0,0,0,0,0,0,0]})
df['Aggreated Operations'] = df.groupby(['Hospital'])['Total Operations'].cumsum()
df['Aggreated Erros'] = df.groupby(['Hospital'])['Errors'].cumsum()
print(df)

这将输出:

         Date    Hospital  ...  Aggreated Operations  Aggreated Erros
0  2001-10-01  Hospital A  ...                  1101                0
1  2001-10-01  Hospital B  ...                    32                0
2  2001-10-02  Hospital A  ...                  2285                0
3  2001-10-02  Hospital B  ...                   106                0
4  2001-10-03  Hospital A  ...                  3635                0
5  2001-10-03  Hospital B  ...                   178                0
6  2001-10-04  Hospital A  ...                  4999                0
7  2001-10-04  Hospital B  ...                   410                0

这个代码应该可以工作

df['Aggregated Operations'] = df.groupby('Hospital')['Total Operations'].cumsum()

相关问题 更多 >