Conditional merge in pandas

proc sql;
  create table temp_mf3 as
  select a.*, b.mret
  from header_mf as a, ret_mf as b
  where a.crsp_fundno = b.crsp_fundno
    and ((year(a.caldt) = year(b.caldt) and month(a.caldt) > month(b.caldt))
      or (year(a.caldt) = (year(b.caldt) + 1) and month(a.caldt) <= month(b.caldt)));
quit;
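For reference, the query can be translated fairly literally into pandas by joining on the fund identifier and then filtering on the date condition. This is only a sketch: the toy rows are invented for illustration, `caldt` is assumed to be a datetime column, and the intermediate frame `pairs` is a hypothetical name.

```python
import pandas as pd

# Invented toy data; only the column names come from the SQL query.
header_mf = pd.DataFrame({
    "crsp_fundno": [1, 1],
    "caldt": pd.to_datetime(["1986-12-31", "1987-03-31"]),
})
ret_mf = pd.DataFrame({
    "crsp_fundno": [1, 1, 1, 1],
    "caldt": pd.to_datetime(
        ["1985-12-31", "1986-06-30", "1986-11-28", "1987-01-30"]
    ),
    "mret": [0.01, 0.02, 0.03, 0.04],
})

# Step 1: equi-join on the fund id, pairing every header row with
# every return row of the same fund (ret_mf's date becomes caldt_b).
pairs = header_mf.merge(ret_mf, on="crsp_fundno", suffixes=("", "_b"))

# Step 2: apply the date part of the WHERE clause.
ya, ma = pairs["caldt"].dt.year, pairs["caldt"].dt.month
yb, mb = pairs["caldt_b"].dt.year, pairs["caldt_b"].dt.month
same_year_earlier = (ya == yb) & (ma > mb)
prev_year_later = (ya == yb + 1) & (ma <= mb)

# Keep a.* and b.mret, as in the SELECT list.
temp_mf3 = pairs[same_year_earlier | prev_year_later].drop(columns="caldt_b")
print(temp_mf3)
```

Note that the intermediate join holds every pair of rows within a fund before filtering, so it can get large on a full panel.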

1 answer

User

#1 · Posted 2024-06-28 19:46:39

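Apologies if this reply comes too late. I don't think you want a conditional merge (at least if I understand the situation correctly). I think you can get what you want by merging header_mf and ret_mf on ['fundno','caldt'] and then using the shift method in pandas to create columns of past returns.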

So I assume your data basically look like this:

import pandas as pd
header = pd.read_csv('header.csv')
print(header)

    fundno       caldt  foo
0        1  1986-06-30  100
1        1  1986-07-31  110
2        1  1986-08-29  120
3        1  1986-09-30  115
4        1  1986-10-31  110
5        1  1986-11-28  125
6        1  1986-12-31  137
7        2  1986-06-30  130
8        2  1986-07-31  204
9        2  1986-08-29  192
10       2  1986-09-30  180
11       2  1986-10-31  200
12       2  1986-11-28  205
13       2  1986-12-31  205

ret_mf = pd.read_csv('ret_mf.csv')
print(ret_mf)

    fundno       caldt  mret
0        1  1986-06-30  0.05
1        1  1986-07-31  0.01
2        1  1986-08-29 -0.01
3        1  1986-09-30  0.10
4        1  1986-10-31  0.04
5        1  1986-11-28 -0.02
6        1  1986-12-31 -0.06
7        2  1986-06-30 -0.04
8        2  1986-07-31  0.03
9        2  1986-08-29  0.07
10       2  1986-09-30  0.00
11       2  1986-10-31 -0.05
12       2  1986-11-28  0.09
13       2  1986-12-31  0.04

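Obviously the header file may contain many more variables (besides the foo variable I made up). But if this basically captures the nature of your data, I think you can merge on ['fundno','caldt'] and then use shift: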

mf = header.merge(ret_mf, on=['fundno', 'caldt'])

Now you can create the past-return variables. Because I made such a small example panel, I only build the past 3 months of returns:

for lag in range(1, 4):
    # keep a lagged value only if the row `lag` places back is the same fund
    good = mf['fundno'] == mf['fundno'].shift(lag)
    mf['ret' + str(lag)] = mf['mret'].shift(lag).where(good)
print(mf)

    fundno       caldt  foo  mret  ret1  ret2  ret3
0        1  1986-06-30  100  0.05   NaN   NaN   NaN
1        1  1986-07-31  110  0.01  0.05   NaN   NaN
2        1  1986-08-29  120 -0.01  0.01  0.05   NaN
3        1  1986-09-30  115  0.10 -0.01  0.01  0.05
4        1  1986-10-31  110  0.04  0.10 -0.01  0.01
5        1  1986-11-28  125 -0.02  0.04  0.10 -0.01
6        1  1986-12-31  137 -0.06 -0.02  0.04  0.10
7        2  1986-06-30  130 -0.04   NaN   NaN   NaN
8        2  1986-07-31  204  0.03 -0.04   NaN   NaN
9        2  1986-08-29  192  0.07  0.03 -0.04   NaN
10       2  1986-09-30  180  0.00  0.07  0.03 -0.04
11       2  1986-10-31  200 -0.05  0.00  0.07  0.03
12       2  1986-11-28  205  0.09 -0.05  0.00  0.07
13       2  1986-12-31  205  0.04  0.09 -0.05  0.00
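A more compact way to get the same lag columns is to shift within each fund using groupby, which removes the need for the same-fund mask. The sketch below uses a cut-down copy of the example panel. It assumes the rows are sorted by fund and date and that no months are missing, because shift lags by row position rather than by calendar month.

```python
import pandas as pd

# Cut-down version of the example panel (three months per fund).
mf = pd.DataFrame({
    "fundno": [1, 1, 1, 2, 2, 2],
    "caldt": pd.to_datetime(["1986-06-30", "1986-07-31", "1986-08-29"] * 2),
    "mret": [0.05, 0.01, -0.01, -0.04, 0.03, 0.07],
})

# Sort so that "previous row" means "previous month" within each fund.
mf = mf.sort_values(["fundno", "caldt"]).reset_index(drop=True)

# Shifting inside each group never pulls a value across fund boundaries.
for lag in range(1, 3):
    mf["ret" + str(lag)] = mf.groupby("fundno")["mret"].shift(lag)

print(mf)
```

If the panel has gaps, a row-based shift would silently pair a month with the wrong earlier month, so it may be worth checking the date spacing first.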

Apologies if I have misunderstood your data.
