Update rows based on a condition and remove some rows within grouped data

Posted 2024-09-19 23:35:01


I have the following dataframe with 4 columns. Let's call it df.

    ID  Start transfer  Finish transfer Ward
0   7685933 04/11/2015 12:07    05/11/2015 12:49    General surgery
1   7685933 05/11/2015 12:49    11/11/2015 14:42    Anestesiology
2   7685933 11/11/2015 14:42    11/11/2015 16:12    Anestesiology
3   7685933 11/11/2015 16:12    18/11/2015 21:24    General surgery
4   7685933 18/11/2015 21:24    02/01/2016 06:45    ICU
5   7690142 06/11/2015 17:24    30/11/2015 18:11    Internal Medicine
6   7690142 30/11/2015 18:11    02/12/2015 17:04    Internal Medicine
7   7690142 02/12/2015 17:04    03/12/2015 20:40    Internal Medicine
8   7690142 03/12/2015 20:40    11/01/2016 18:00    Internal Medicine
9   7691888 08/11/2015 16:28    16/11/2015 17:11    Internal Medicine
10  7691888 16/11/2015 17:11    20/11/2015 18:13    Internal Medicine
11  7691888 20/11/2015 18:13    04/01/2016 18:02    Internal Medicine
12  7691888 04/01/2016 18:02    04/01/2016 21:13    Internal Medicine

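Now I want to group the data by the "ID" column and then find consecutive rows with the same ward, where the "Finish transfer" of one row equals the "Start transfer" of the next row with the same ward name. Once those rows are identified, I need to copy the Finish transfer entry from the last row of the consecutive run and use it to update the first row of that run. For example, the rows at index 1 and index 2 have the same ward, and the Finish transfer of the row at index 1 equals the Start transfer of the row at index 2. I want a single row for such a consecutive run, with Start transfer taken from the row at index 1 and Finish transfer taken from the row at index 2.

I would like the following output (possibly in a new dataframe):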

    ID  Start transfer  Finish transfer Ward
0   7685933 04/11/2015 12:07    05/11/2015 12:49    General surgery
1   7685933 05/11/2015 12:49    11/11/2015 16:12    Anestesiology
2   7685933 11/11/2015 16:12    18/11/2015 21:24    General surgery
3   7685933 18/11/2015 21:24    02/01/2016 06:45    ICU
4   7690142 06/11/2015 17:24    11/01/2016 18:00    Internal Medicine
5   7691888 08/11/2015 16:28    04/01/2016 21:13    Internal Medicine

Thanks in advance for your help.


1 answer
User
#1 · Posted 2024-09-19 23:35:01

IIUC (if I understand correctly):

df.groupby(['ID','Ward']).agg({'Start transfer':'first','Finish transfer':'last'}).reset_index()
Out[151]: 
        ID               Ward    Start transfer   Finish transfer
0  7685933      Anestesiology  05/11/2015 12:49  11/11/2015 16:12
1  7685933    General surgery  04/11/2015 12:07  18/11/2015 21:24
2  7685933                ICU  18/11/2015 21:24  02/01/2016 06:45
3  7690142  Internal Medicine  06/11/2015 17:24  11/01/2016 18:00
4  7691888  Internal Medicine  08/11/2015 16:28  04/01/2016 21:13

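Update

The version above groups only by ID and Ward, so it also merges stays that are not consecutive (the two General surgery stays of ID 7685933 collapse into one row). Adding a key that increments whenever the ward changes keeps separate runs apart: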

df.assign(Key=(df.Ward.shift()!=df.Ward).cumsum()).groupby(['ID','Ward','Key']).agg({'Start transfer':'first','Finish transfer':'last'}).reset_index().sort_values('Key')
Out[181]: 
        ID               Ward  Key    Start transfer   Finish transfer
1  7685933    General surgery    1  04/11/2015 12:07  05/11/2015 12:49
0  7685933      Anestesiology    2  05/11/2015 12:49  11/11/2015 16:12
2  7685933    General surgery    3  11/11/2015 16:12  18/11/2015 21:24
3  7685933                ICU    4  18/11/2015 21:24  02/01/2016 06:45
4  7690142  Internal Medicine    5  06/11/2015 17:24  11/01/2016 18:00
5  7691888  Internal Medicine    5  08/11/2015 16:28  04/01/2016 21:13
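
Neither version above checks that the Finish transfer of one row equals the Start transfer of the next, which the question asks for. The following is a minimal sketch that adds that check, under a few assumptions: the rows are already sorted by ID and in chronological order, the timestamps are kept as strings (so equality is an exact string match), and the helper name merge_consecutive_stays is made up for illustration.

```python
import pandas as pd

# Sample data copied from the question.
rows = [
    (7685933, "04/11/2015 12:07", "05/11/2015 12:49", "General surgery"),
    (7685933, "05/11/2015 12:49", "11/11/2015 14:42", "Anestesiology"),
    (7685933, "11/11/2015 14:42", "11/11/2015 16:12", "Anestesiology"),
    (7685933, "11/11/2015 16:12", "18/11/2015 21:24", "General surgery"),
    (7685933, "18/11/2015 21:24", "02/01/2016 06:45", "ICU"),
    (7690142, "06/11/2015 17:24", "30/11/2015 18:11", "Internal Medicine"),
    (7690142, "30/11/2015 18:11", "02/12/2015 17:04", "Internal Medicine"),
    (7690142, "02/12/2015 17:04", "03/12/2015 20:40", "Internal Medicine"),
    (7690142, "03/12/2015 20:40", "11/01/2016 18:00", "Internal Medicine"),
    (7691888, "08/11/2015 16:28", "16/11/2015 17:11", "Internal Medicine"),
    (7691888, "16/11/2015 17:11", "20/11/2015 18:13", "Internal Medicine"),
    (7691888, "20/11/2015 18:13", "04/01/2016 18:02", "Internal Medicine"),
    (7691888, "04/01/2016 18:02", "04/01/2016 21:13", "Internal Medicine"),
]
df = pd.DataFrame(rows, columns=["ID", "Start transfer", "Finish transfer", "Ward"])


def merge_consecutive_stays(frame):
    """Collapse back-to-back stays in the same ward (hypothetical helper)."""
    # Previous row's ward and finish time within the same ID;
    # the first row of each ID gets NaN and therefore never "continues".
    prev = frame.groupby("ID")[["Ward", "Finish transfer"]].shift()
    continues = (frame["Ward"] == prev["Ward"]) & (
        frame["Start transfer"] == prev["Finish transfer"]
    )
    # A new run starts at every row that does not continue the previous one.
    run = (~continues).cumsum().rename("run")
    return (
        frame.groupby(run)
        .agg({"ID": "first", "Start transfer": "first",
              "Finish transfer": "last", "Ward": "first"})
        .reset_index(drop=True)
    )


result = merge_consecutive_stays(df)
print(result)
```

Because the previous row is looked up per ID, a run never spans two patients, and a gap between Finish transfer and the next Start transfer starts a new row even when the ward is unchanged. If the timestamps may be formatted inconsistently, parsing them first with pd.to_datetime(..., dayfirst=True) would make the comparison more robust.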
