How to copy values from another dataset and paste them into a column

Published 2024-09-22 16:32:57


I have df1

    Id Data    Group_Id
0    1 A         1
1    2 B         2
2    3 B         3
      ...
100  4 A         101
101  5 A         102
      ...

and df2

      Timestamp           Group_Id
2012-01-01 00:00:05.523    1
2013-07-01 00:00:10.757    2
2014-01-12 00:00:15.507    3
                   ...
2016-03-05 00:00:05.743    101
2017-12-24 00:00:10.407    102
                   ...

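I want to match the two datasets on Group_Id, then copy only the date from Timestamp in df2 and paste it into a new column in df1 according to the corresponding Group_Id, naming that column day1.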

Then I want to add 6 more columns next to day1, named day2 through day7, holding the 6 days that follow day1. So it should look like:

    Id Data    Group_Id    day1    day2       day3        ...    day7  
0    1 A         1      2012-01-01 2012-01-02 2012-01-03         ...
1    2 B         2      2013-07-01 2013-07-02 2013-07-03         ...
2    3 B         3      2014-01-12 2014-01-13 2014-01-14         ...
                              ...
100  4 A         101    2016-03-05 2016-03-06 2016-03-07         ...
101  5 A         102    2017-12-24 2017-12-25 2017-12-26         ...
                              ...

Thanks.


2 answers

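Here is another approach: it simply merges the DataFrames, extracts the date from the timestamp, and creates 6 new columns, adding one more day each time: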

import pandas as pd
df1 = pd.read_csv('df1.csv')
df2 = pd.read_csv('df2.csv')
df3 = df1.merge(df2, on='Group_Id')

# Convert to datetime; only necessary if the column is not already a timestamp
df3['Timestamp'] = pd.to_datetime(df3['Timestamp'])
# Keep only the date part
df3['day1'] = df3['Timestamp'].dt.date

# day2..day7: add 1 to 6 days to day1
for i in range(1, 7):
    df3['day' + str(i + 1)] = df3['day1'] + pd.Timedelta(i, unit='d')

Output:

   Id Data  Group_Id               Timestamp        day1        day2        day3        day4        day5        day6        day7
0   1    A         1 2012-01-01 00:00:05.523  2012-01-01  2012-01-02  2012-01-03  2012-01-04  2012-01-05  2012-01-06  2012-01-07
1   2    B         2 2013-07-01 00:00:10.757  2013-07-01  2013-07-02  2013-07-03  2013-07-04  2013-07-05  2013-07-06  2013-07-07
2   3    B         3 2014-01-12 00:00:15.507  2014-01-12  2014-01-13  2014-01-14  2014-01-15  2014-01-16  2014-01-17  2014-01-18
3   4    A       101 2016-03-05 00:00:05.743  2016-03-05  2016-03-06  2016-03-07  2016-03-08  2016-03-09  2016-03-10  2016-03-11
4   5    A       102 2017-12-24 00:00:10.407  2017-12-24  2017-12-25  2017-12-26  2017-12-27  2017-12-28  2017-12-29  2017-12-30

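Note that I copied your DataFrames into CSV files with only 5 rows, so the index differs from your example (i.e. 100, 101).

You can drop the Timestamp column if you don't need it.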
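For readers without the CSV files, the same merge-then-add-days idea can be sketched with in-memory data. This is only a minimal sketch: the helper name `add_day_columns` and its `n_days` parameter are hypothetical, and it swaps `.dt.date` for `.dt.normalize()` so that the day columns stay in the datetime64 dtype (midnight timestamps) rather than Python `date` objects.

```python
import pandas as pd


def add_day_columns(df1, df2, n_days=7):
    """Hypothetical helper: merge on Group_Id and add day1..day<n_days>."""
    merged = df1.merge(df2, on="Group_Id")
    # Parse the timestamps (harmless if they are already datetime64)
    ts = pd.to_datetime(merged["Timestamp"])
    # normalize() sets the time part to 00:00:00 and keeps the datetime64 dtype
    merged["day1"] = ts.dt.normalize()
    for i in range(1, n_days):
        merged[f"day{i + 1}"] = merged["day1"] + pd.Timedelta(days=i)
    # The original Timestamp column is no longer needed
    return merged.drop(columns="Timestamp")


# Small sample taken from the first three rows of the question
df1 = pd.DataFrame({"Id": [1, 2, 3], "Data": ["A", "B", "B"], "Group_Id": [1, 2, 3]})
df2 = pd.DataFrame({
    "Timestamp": ["2012-01-01 00:00:05.523", "2013-07-01 00:00:10.757", "2014-01-12 00:00:15.507"],
    "Group_Id": [1, 2, 3],
})

result = add_day_columns(df1, df2)
print(result)
```

If plain dates are preferred for display, each day column could be converted afterwards with `.dt.date`.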

First, we need a merge here:

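# Left merge keeps every row of df1; with no `on`, pandas joins on the shared column Group_Id
df1 = df1.merge(df2, how='left')
# One row of 7 consecutive days per Timestamp (periods=7 covers day1..day7);
# normalize=True drops the time of day
s = pd.DataFrame([pd.date_range(x, periods=7, freq='D', normalize=True) for x in df1.Timestamp], index=df1.index)
s.columns += 1  # renumber columns 0..6 as 1..7
df1 = df1.join(s.add_prefix('day'))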
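One thing to keep in mind with `how='left'`: a Group_Id in df1 with no match in df2 ends up with a missing Timestamp (NaT), and a row-by-row `pd.date_range` call has no valid start date for such a row. The sketch below is a swapped-in variant, not the answer's method: it uses vectorized datetime arithmetic, which simply propagates NaT. The sample data (including the unmatched Group_Id 99) is made up for illustration.

```python
import pandas as pd

# Made-up sample: Group_Id 99 has no counterpart in df2
df1 = pd.DataFrame({"Id": [1, 2, 3], "Data": ["A", "B", "B"], "Group_Id": [1, 2, 99]})
df2 = pd.DataFrame({
    "Timestamp": pd.to_datetime(["2012-01-01 00:00:05.523", "2013-07-01 00:00:10.757"]),
    "Group_Id": [1, 2],
})

# Left merge keeps all rows of df1; unmatched rows get NaT in Timestamp
merged = df1.merge(df2, on="Group_Id", how="left")

# Midnight of each timestamp; NaT stays NaT
base = merged["Timestamp"].dt.normalize()

# Build day1..day7 in one go by adding 0..6 days to the base date
days = pd.DataFrame(
    {f"day{i + 1}": base + pd.Timedelta(days=i) for i in range(7)},
    index=merged.index,
)

result = merged.drop(columns="Timestamp").join(days)
print(result)
```

Rows without a match keep NaT in every day column, which makes them easy to spot or filter out with `result.dropna(subset=["day1"])`.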
