Computing anniversary dates for each row and each index

Posted 2024-06-26 12:38:50

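The first row of my contract starts on 2008-02-11 and ends on 2011-03-28. I want to compute the anniversary date for every year from 2008 to 2011 (for contract 144 and for the other contracts as well).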

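The main goal is to check whether the anniversary date of each contract row is correct and, if it is not, to compute it and update start and end with the correct values.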

This is my pandas DataFrame, covering two contracts, "144" and "150":

^{tb1}$

This is the DataFrame I want to get:

^{tb2}$

Here is my code. It works when there is only one NUM_contrat, but not with two or more:


    
for NUM_contrat in df['NUM_contrat'].unique():
    for i in df['index'].unique():
        for index, row in df.iterrows():
            if df.iloc[index]['end'] > df.iloc[index]['anniversary']:
                df1 = pd.DataFrame(df.iloc[index]).transpose()
                df.loc[index, 'end'] = df.loc[index, 'anniversary']
                df = pd.concat([df, df1], ignore_index=True).sort_values(['start', 'end']).reset_index(drop=True)
                df.loc[index+1, 'start'] = df.loc[index, 'anniversary']
                df.loc[index+1, 'anniversary'] = df.loc[index, 'anniversary'] + relativedelta(years=1)

    return df

2 answers
  • Generate a date range of yearly start dates
  • explode() it to generate the rows you need
  • Compute the end and anniversary dates
import io

import numpy as np
import pandas as pd

# columns are whitespace-separated here, so split on any run of whitespace
df = pd.read_csv(io.StringIO("""index   NUM_contrat start   end anniversary quantity
0   144 2008-02-11  2011-03-28  2009-02-11  550
1   144 2011-03-28  2011-09-19  2012-02-11  550
2   150 2011-09-19  2012-02-10  2012-09-19  900
3   150 2012-02-10  2013-02-10  2013-09-19  900"""), sep=r"\s+", index_col=0)

# cleanup - make sure dates are dates
df.start = pd.to_datetime(df.start)
df.end = pd.to_datetime(df.end)
df.anniversary = pd.to_datetime(df.anniversary)
df
# generate a yearly date range for start, with its length based on the end date
df2 = (df.assign(start=df.apply(lambda r: pd.date_range(r.start, 
                                                 periods=((r.end.year+1)-r.start.year), 
                                                 freq=pd.DateOffset(years=1)), axis=1))
# explode the start dates
 .explode("start")
# make sure start is datetime64 again after explode (it can come back as object dtype)
 .assign(start=lambda dfa: pd.to_datetime(dfa.start))
# calc end and anniversary dates
 .assign(end=lambda dfa: np.where(dfa.start.dt.year==dfa.end.dt.year,dfa.end, dfa.start+pd.DateOffset(years=1)),
        anniversary=lambda dfa: dfa.start+pd.DateOffset(years=1))
# anniversary is always the one from the first instance of the contract
 .assign(anniversary=lambda dfa: dfa.groupby(["NUM_contrat",dfa.start.dt.year])["anniversary"].transform("first"))
)

df2

Output

^{tb1}$
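
The first two steps of the answer above (build a yearly range of start dates, then explode it) can be tried in isolation on a single contract. This is only a minimal sketch: the one-row frame `demo` is a made-up name that holds the dates of contract 144 from the question, and it is not the answerer's full pipeline.

```python
import pandas as pd

start = pd.Timestamp("2008-02-11")
end = pd.Timestamp("2011-03-28")

# One start date per calendar year the contract touches (2008..2011 -> 4 dates).
yearly_starts = pd.date_range(
    start,
    periods=end.year + 1 - start.year,
    freq=pd.DateOffset(years=1),
)

# Hypothetical one-row frame whose "start" cell holds the whole list of dates.
demo = pd.DataFrame({"NUM_contrat": [144], "start": [list(yearly_starts)]})

# explode() turns the list into one row per date. Depending on the pandas
# version the column may come back as object dtype, so convert it explicitly
# before using the .dt accessor.
exploded = demo.explode("start", ignore_index=True)
exploded["start"] = pd.to_datetime(exploded["start"])

print(exploded)
```

Each exploded row keeps the month and day of the original start date, which is why the later steps can derive the anniversary by adding one year to `start`.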

Update: per your comment, each contract has multiple rows

The following snippet may answer your question:

>>> data
   NUM_contrat      start        end anniversary  quantity
0          144 2008-02-11 2011-03-28  2009-02-11       550
1          144 2011-03-28 2011-09-19  2012-02-11       550
2          150 2011-09-19 2012-02-10  2012-09-19       900
3          150 2012-02-10 2013-02-10  2013-09-19       900
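# For each row whose anniversary falls before its end date, append one copy
# of the row per additional yearly anniversary
for _, sr in data.loc[data["anniversary"] < data["end"]].iterrows():
    df = sr.to_frame().transpose()
    periods = sr["end"].year - sr["start"].year
    # freq="Y" yields year-end dates (newer pandas versions spell it "YE")
    idx = pd.date_range(sr["anniversary"], periods=periods, freq="Y")
    # shift each year-end date to the anniversary's month and day in the following year
    idx += pd.DateOffset(days=sr["anniversary"].day, months=sr["anniversary"].month - 1)
    data = pd.concat([data, df.loc[df.index.repeat(len(idx))].assign(anniversary=idx)])
# Result, shown sorted by index:
  NUM_contrat      start        end anniversary quantity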
0         144 2008-02-11 2011-03-28  2009-02-11      550
0         144 2008-02-11 2011-03-28  2010-02-11      550
0         144 2008-02-11 2011-03-28  2011-02-11      550
0         144 2008-02-11 2011-03-28  2012-02-11      550
1         144 2011-03-28 2011-09-19  2012-02-11      550
2         150 2011-09-19 2012-02-10  2012-09-19      900
3         150 2012-02-10 2013-02-10  2013-09-19      900
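
The splitting rule from the question's loop (while a row's end is later than its anniversary, cut the row at the anniversary and move the anniversary one year forward) can also be applied to each row independently. That avoids modifying the frame while iterating over it, so the number of contracts no longer matters. What follows is a minimal sketch under that reading of the question: `split_at_anniversaries` and `contracts` are hypothetical names, not code from either answer, and the sample rows are the ones shown above.

```python
import pandas as pd

contracts = pd.DataFrame({
    "NUM_contrat": [144, 144, 150, 150],
    "start": pd.to_datetime(["2008-02-11", "2011-03-28", "2011-09-19", "2012-02-10"]),
    "end": pd.to_datetime(["2011-03-28", "2011-09-19", "2012-02-10", "2013-02-10"]),
    "anniversary": pd.to_datetime(["2009-02-11", "2012-02-11", "2012-09-19", "2013-09-19"]),
    "quantity": [550, 550, 900, 900],
})


def split_at_anniversaries(record):
    """Hypothetical helper: cut one row (a dict) at each anniversary before its end."""
    start, end, anniversary = record["start"], record["end"], record["anniversary"]
    pieces = []
    while end > anniversary:
        # Close the current piece at the anniversary...
        pieces.append({**record, "start": start, "end": anniversary, "anniversary": anniversary})
        # ...and open the next piece there, with its anniversary one year later.
        start = anniversary
        anniversary = anniversary + pd.DateOffset(years=1)
    # The last (or only) piece keeps the original end date.
    pieces.append({**record, "start": start, "end": end, "anniversary": anniversary})
    return pieces


# Rows are processed one at a time, so contracts cannot interfere with each other.
result = pd.DataFrame(
    [piece for record in contracts.to_dict("records") for piece in split_at_anniversaries(record)]
)
print(result)
```

Under this rule only the first row of contract 144 is split, into four pieces; the other three rows pass through unchanged.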
