如何按日期范围聚合总和并联接

2024-09-28 21:39:02 发布

您现在位置:Python中文网/ 问答频道 /正文

我有一个客户订阅表a,大部分是1年合同。该期间从每月的第一天开始,到每月的最后一天结束

,companyid,startdate,enddate
16216,10020659,2015-08-01,2016-07-31
23926,10020659,2016-08-01,2017-07-31
30078,10020659,2017-08-01,2018-07-31

表b用于跟踪客户的活动:

,companyid,isomonthofyear,count1,count2,count3
6325,10020659,2015-06-01,348,346,2
23605,10020659,2015-07-01,509,499,10
465310,10020659,2015-08-01,561,543,18
473875,10020659,2015-09-01,467,453,14
661421,10020659,2015-10-01,528,518,10
515050,10020659,2015-11-01,586,578,8
508636,10020659,2015-12-01,520,507,13
728064,10020659,2016-01-01,602,599,3
728024,10020659,2016-02-01,604,596,8
532500,10020659,2016-03-01,967,958,9
528642,10020659,2016-04-01,725,710,15
496834,10020659,2016-05-01,811,800,11
496701,10020659,2016-06-01,677,661,16
496682,10020659,2016-07-01,694,676,18
465301,10020659,2016-08-01,870,858,12

我只想汇总订阅期间的活动总数(count1-3),并加入表a:

,companyid,startdate,enddate,sum_count1,sum_count2, sum_count3
 10020659.   2015-08-01,2016-07-31      10 20 30
 10020659,   2016-08-01,2017-07-31      14 589 29
,10020659,2017-08-01,2018-07-31         20 40 30

请注意,存在非订阅活动和不同的公司。我想我们可以把结束日期转换成一个月的第一天


Tags: 客户公司汇总sum总数count1startdateenddate
1条回答
网友
1楼 · 发布于 2024-09-28 21:39:02

首先确保类似日期的列为数据类型datetime:

df1['startdate']=pd.to_datetime(df1['startdate'])
df1['enddate']=pd.to_datetime(df1['enddate'])
df2['isomonthofyear']=pd.to_datetime(df2['isomonthofyear'])

最后:

out=df1.merge(df2,on='companyid',how='right')
#merging both df's on 'companyid' so to checking our criteria
m=out['isomonthofyear'].between(out['startdate'],out['enddate'])
#checking if it is under the subscription period or not
out=out[m].rename(columns={'counts':'count3'}).groupby('companyid').sum().add_prefix('sum_').reset_index()
#Filtering rows(selecting only those which are in subscription period) and then grouping them on companyid and performing sum
out=df1.merge(out,on='companyid',how='left')
#Finally merging the calculated sum with df1

out的输出:

    companyid   startdate   enddate     sum_count1  sum_count2  sum_count3
0   10020659    2015-08-01  2016-07-31     8612         8457    155
1   10020659    2016-08-01  2017-07-31     8612         8457    155
2   10020659    2017-08-01  2018-07-31     8612         8457    155

相关问题 更多 >