Creating groups based on multiple datetime comparisons

Posted 2024-06-23 19:00:21

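I am trying to create a new column based on existing columns, filling it with a value according to how one date column compares with three other date columns.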

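A sample of the DataFrame df is shown below. All of the dates shown have been converted with pd.to_datetime, which produced many NaT values because some individuals did not progress.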

    1st_date     2nd_date        3rd_date     action_date
    2015-10-05   NaT             NaT          2015-12-03 
    2015-02-27   2015-03-14      2015-03-15   2015-04-08 
    2015-03-07   2015-03-27      2015-03-28   2015-03-27 
    2015-01-05   2015-01-20      2015-01-21   2015-05-20 
    2015-01-05   2015-01-20      2015-01-21   2015-09-16 
    2015-05-23   2015-06-18      2015-06-19   2015-07-01 
    2015-03-03   NaT             NaT          2015-07-23 
    2015-03-03   NaT             NaT          2015-11-14 
    2015-06-05   2015-06-19      2015-06-20   2015-10-24 
    2015-10-08   2015-10-21      2015-10-22   2015-12-22 
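For anyone who wants to reproduce the problem, the sample above can be rebuilt roughly as follows. This is only a sketch: the `rows` list is a hypothetical helper name, and it assumes the missing dates start out as `None` before conversion, which the question does not state.

```python
import pandas as pd

# Sample rows copied from the table in the question; None marks a missing date.
rows = [
    ("2015-10-05", None, None, "2015-12-03"),
    ("2015-02-27", "2015-03-14", "2015-03-15", "2015-04-08"),
    ("2015-03-07", "2015-03-27", "2015-03-28", "2015-03-27"),
    ("2015-01-05", "2015-01-20", "2015-01-21", "2015-05-20"),
    ("2015-01-05", "2015-01-20", "2015-01-21", "2015-09-16"),
    ("2015-05-23", "2015-06-18", "2015-06-19", "2015-07-01"),
    ("2015-03-03", None, None, "2015-07-23"),
    ("2015-03-03", None, None, "2015-11-14"),
    ("2015-06-05", "2015-06-19", "2015-06-20", "2015-10-24"),
    ("2015-10-08", "2015-10-21", "2015-10-22", "2015-12-22"),
]
df = pd.DataFrame(rows, columns=["1st_date", "2nd_date", "3rd_date", "action_date"])

# Convert every column to datetime64; missing values become NaT.
df = df.apply(pd.to_datetime)
print(df)
```

Any comparison against NaT evaluates to False, which matters for how the rows with missing dates get grouped.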

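I am trying to create a fifth column that holds the result (or group) of comparing the action_date column with the first three date columns: 1st_date, 2nd_date, 3rd_date.

I want to fill this fifth column, action_group, with a string that assigns each date to a group.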

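Pseudocode for a possible function (and the expected output) is as follows: if action_date > 1st_date and action_date < 2nd_date then action_group = '1st_action_group'

The same kind of comparison is needed for action_date against 2nd_date and 3rd_date, which would result in 2nd_action_group in the action_group column.

Finally, if action_date is greater than 3rd_date, action_group should be assigned 3rd_action_group.

An example of the expected output is shown below.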

1st_date     2nd_date        3rd_date     action_date  action_group
2015-10-05   NaT             NaT          2015-12-03   1st_action_group
2015-02-27   2015-03-14      2015-03-15   2015-04-08   3rd_action_group
2015-03-07   2015-03-27      2015-03-28   2015-03-27   2nd_action_group
2015-01-05   2015-01-20      2015-01-21   2015-05-20   3rd_action_group
2015-01-05   2015-01-20      2015-01-21   2015-09-16   3rd_action_group
2015-05-23   2015-06-18      2015-06-19   2015-07-01   3rd_action_group
2015-03-03   NaT             NaT          2015-07-23   1st_action_group
2015-03-03   NaT             NaT          2015-11-14   1st_action_group
2015-06-05   2015-06-19      2015-06-20   2015-10-24   3rd_action_group
2015-10-08   2015-10-21      2015-10-22   2015-12-22   3rd_action_group

Any help anyone can offer would be greatly appreciated.


1 answer
User
#1 · Posted 2024-06-23 19:00:21
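import numpy as np

# Comparisons against NaT evaluate to False, so rows with missing dates
# fall through to '1st_action_group'.
df['action_group'] = np.where(df['action_date'] > df['3rd_date'],
                              '3rd_action_group',
                              np.where((df['action_date'] >= df['2nd_date']) & (df['action_date'] < df['3rd_date']),
                                       '2nd_action_group',
                                       '1st_action_group'))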

You just need to nest two np.where calls to get the result you want.

    1st_date    2nd_date    3rd_date    action_date action_group
0   2015-10-05     NaT          NaT     2015-12-03  1st_action_group
1   2015-02-27  2015-03-14  2015-03-15  2015-04-08  3rd_action_group
2   2015-03-07  2015-03-27  2015-03-28  2015-03-27  2nd_action_group
3   2015-01-05  2015-01-20  2015-01-21  2015-05-20  3rd_action_group
4   2015-01-05  2015-01-20  2015-01-21  2015-09-16  3rd_action_group
5   2015-05-23  2015-06-18  2015-06-19  2015-07-01  3rd_action_group
6   2015-03-03     NaT          NaT     2015-07-23  1st_action_group
7   2015-03-03     NaT          NaT     2015-11-14  1st_action_group
8   2015-06-05  2015-06-19  2015-06-20  2015-10-24  3rd_action_group
9   2015-10-08  2015-10-21  2015-10-22  2015-12-22  3rd_action_group
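If the nesting gets hard to read, the same idea could probably be written with np.select, which takes a list of conditions and returns the choice for the first one that matches. The following is a minimal sketch on three of the sample rows, not a drop-in replacement. Because the conditions are checked in order, the second one only tests `>= 2nd_date`. As a consequence, an action_date exactly equal to 3rd_date lands in 2nd_action_group here, which is an assumption about the desired behavior and differs from the nested np.where version.

```python
import numpy as np
import pandas as pd

# Three representative rows from the question (None becomes NaT).
df = pd.DataFrame(
    [
        ("2015-10-05", None, None, "2015-12-03"),
        ("2015-02-27", "2015-03-14", "2015-03-15", "2015-04-08"),
        ("2015-03-07", "2015-03-27", "2015-03-28", "2015-03-27"),
    ],
    columns=["1st_date", "2nd_date", "3rd_date", "action_date"],
).apply(pd.to_datetime)

# Conditions are evaluated in order; the first True one wins.
conditions = [
    df["action_date"] > df["3rd_date"],
    df["action_date"] >= df["2nd_date"],
]
choices = ["3rd_action_group", "2nd_action_group"]

# Rows matching no condition (including NaT comparisons) get the default.
df["action_group"] = np.select(conditions, choices, default="1st_action_group")
print(df["action_group"].tolist())
# ['1st_action_group', '3rd_action_group', '2nd_action_group']
```

Adding a fourth group later would then only mean adding one more entry to `conditions` and `choices`, rather than another level of nesting.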
