I aggregate data by resampling, as shown below:
import quandl
import numpy as np

# Download the WIKI/KO price series and keep the first 40 closing prices
data = quandl.get("WIKI/KO", trim_start="2000-12-12", trim_end="2014-12-30")
data = data[['Close']].iloc[:40]  # .ix was removed from pandas; .iloc selects by position
# Random signal in {-1, 0, 1}, forced to 0 from 2001-02-01 onward
data['SIGNAL'] = np.random.randint(0, 3, size=len(data))
data['SIGNAL'] = np.where(data['SIGNAL'] == 2, -1, data['SIGNAL'])
data['SIGNAL'] = np.where(data.index >= '2001-02-01', 0, data['SIGNAL'])
data['WIN'] = 10
print(data.to_string())
# Monthly sum and count of WIN over the rows with a non-zero signal
WinPerYear = data['WIN'].loc[data['SIGNAL'] != 0].resample('M').sum()
CntPerYear = data['WIN'].loc[data['SIGNAL'] != 0].resample('M').count()
print(WinPerYear.to_string())
print(CntPerYear.to_string())
As expected, I get the following result:
Close SIGNAL WIN
Date
[...]
2001-01-17 57.94 -1 10
2001-01-18 57.13 -1 10
2001-01-19 55.81 -1 10
2001-01-22 55.69 -1 10
2001-01-23 56.88 0 10
2001-01-24 58.06 -1 10
2001-01-25 58.63 -1 10
2001-01-26 57.94 -1 10
2001-01-29 57.12 0 10
2001-01-30 57.91 0 10
2001-01-31 58.00 1 10
2001-02-01 57.44 0 10
2001-02-02 57.74 0 10
2001-02-05 59.20 0 10
2001-02-06 59.42 0 10
2001-02-07 60.00 0 10
2001-02-08 60.61 0 10
Date
2000-12-31 60
2001-01-31 160
Freq: M
Date
2000-12-31 6
2001-01-31 16
Freq: M
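Is there a simple way, without changing all the subsetting logic, to also add rows for the months that have no matches? For example, 2001/02 has no matching rows, so I would like both aggregations to show 0 for it, like this: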
Date
2000-12-31 60
2001-01-31 160
2001-02-28 0
Freq: M
Date
2000-12-31 6
2001-01-31 16
2001-02-28 0
Freq: M
Many thanks and best regards, e
I solved this by adding a new computed field, which I drop again right after aggregating.
Thanks, cheers, E
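The answer does not show the computed field itself, so the following is only a minimal sketch of what such an approach might look like. It uses synthetic business-day data in place of the Quandl download, and the column names WIN_ACTIVE and CNT_ACTIVE are hypothetical. The idea is to zero out the non-matching rows instead of filtering them away, so every month in the index survives the resample.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the Quandl data (assumption): business days
# from 2000-12-12 to 2001-02-08, with a constant WIN of 10.
idx = pd.bdate_range("2000-12-12", "2001-02-08", name="Date")
data = pd.DataFrame({"WIN": 10}, index=idx)

# Deterministic signal cycling through 1, -1, 0, then forced to 0
# from 2001-02-01 onward, mirroring the question's setup.
data["SIGNAL"] = np.resize([1, -1, 0], len(data))
data.loc[data.index >= "2001-02-01", "SIGNAL"] = 0

# Hypothetical computed fields: keep every row, but let rows with
# SIGNAL == 0 contribute 0 to both the sum and the count.
active = data["SIGNAL"] != 0
data["WIN_ACTIVE"] = data["WIN"].where(active, 0)
data["CNT_ACTIVE"] = active.astype(int)

# Month-end resample; the offset object plays the role of the 'M' rule.
monthly = data[["WIN_ACTIVE", "CNT_ACTIVE"]].resample(pd.offsets.MonthEnd()).sum()

# Drop the helper columns once the aggregation is done.
data = data.drop(columns=["WIN_ACTIVE", "CNT_ACTIVE"])

print(monthly.to_string())
```

Because no rows are filtered out before resampling, February 2001 still appears in the result and both aggregates are 0 for it. The count becomes a sum over a 0/1 column rather than a call to count().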