I have two dataframes, as shown below:
df1=
date company userDomain keyword pageViews category
2015-12-02 1-800 Contacts glasses.com SAN 2 STORAGE
2015-12-02 1-800 Contacts rhgi.com SAN 3 STORAGE
2015-12-02 100 Percent Fun dialogdesign.ca SAN 1 STORAGE
2015-12-02 101netlink 101netlink.com SAN 8 STORAGE
2015-12-02 1020 nlc.bc.ca SAN 4 STORAGE
df2=
Outcome Job Title Wave
Created Opportunity IT Manager 1.0
Closed Out Prospect/Contact Infrastructure Manager 1.0
NaN IT Director 1.0
NaN Supervisor Technical Support 1.0
Created Opportunity Director of IT Services 1.0
Wave Date userDomain
2016-02-16 15:07:05 dialogdesign.ca
2016-02-16 15:07:05 rhgi.com
2016-02-16 15:07:05 surefire.com
2016-02-16 15:07:05 isd2144.org
2016-02-16 15:07:05 nlc.bc.ca
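I want to add a column named wave_date to df1, filled with the date from df2['Wave Date'] for every row whose df1['userDomain'] also appears in df2['userDomain']. If the userDomain does not match between the two frames, the value should be NaN. I apologize if this is a very naive question, but I am frustrated by my failed attempts. This is what I tried: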
df1['wave_date'] = df1.apply(lambda x: df2['Wave Date'] if x['userDomain'].isin(df2['userDomain']) else np.nan)
I keep getting:
IndexError: ('userDomain', 'occurred at index date')
Can you please point out the correct way to do it? Thanks a lot.
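For reference, here is a minimal sketch of one way to get the column described above. Two things likely go wrong in the attempt: `DataFrame.apply` without `axis=1` passes each column (starting with `date`) to the lambda rather than each row, and with `axis=1` `x['userDomain']` would be a plain string, which has no `.isin` method. The sketch avoids `apply` and uses a lookup with `Series.map` instead. The frames below are small stand-ins built from the sample data in the question, and the name `lookup` is only illustrative. It also assumes that one Wave Date per userDomain is wanted, so duplicate domains in df2 are dropped.

```python
import pandas as pd

# Small stand-ins for the frames in the question (only the relevant columns).
df1 = pd.DataFrame({
    "userDomain": ["glasses.com", "rhgi.com", "dialogdesign.ca",
                   "101netlink.com", "nlc.bc.ca"],
    "pageViews": [2, 3, 1, 8, 4],
})
df2 = pd.DataFrame({
    "Wave Date": pd.to_datetime(["2016-02-16 15:07:05"] * 5),
    "userDomain": ["dialogdesign.ca", "rhgi.com", "surefire.com",
                   "isd2144.org", "nlc.bc.ca"],
})

# Build a userDomain -> Wave Date lookup.
# Assumption: keep the first Wave Date when a domain appears more than once.
lookup = df2.drop_duplicates("userDomain").set_index("userDomain")["Wave Date"]

# Domains missing from df2 get a missing value (shown as NaT for datetimes).
df1["wave_date"] = df1["userDomain"].map(lookup)

print(df1)
```

A left merge, `df1.merge(df2[["userDomain", "Wave Date"]], on="userDomain", how="left")`, should give a similar result, but it adds one row per match, so duplicate domains in df2 would duplicate rows of df1.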