I have the following DataFrame of market data:
DP PE BM CAPE
date
1990-01-31 0.0345 13.7235 0.503474 6.460694
1990-02-01 0.0346 13.6861 0.504719 6.396440
1990-02-02 0.0343 13.7707 0.501329 6.440094
1990-02-05 0.0342 13.7676 0.500350 6.460417
1990-02-06 0.0344 13.6814 0.503550 6.419991
... ... ... ... ...
2015-04-28 0.0201 18.7347 0.346717 26.741581
2015-04-29 0.0202 18.6630 0.348080 26.637641
2015-04-30 0.0205 18.4793 0.351642 26.363959
2015-05-01 0.0204 18.6794 0.347814 26.620701
2015-05-04 0.0203 18.7261 0.346813 26.695087
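For each day in this time series, I want to compute the largest PCA component using a backward-looking expanding window. The code below produces the DataFrame above:
[code block missing from the source]
I have also tried several different approaches myself, but none of them seems to give me access to the PCA eigenvalues and eigenvectors, which I need in order to do what this post describes: remove the noise by keeping the signs consistent. Here is a plot of my current PCA values; the sign flipping is a very big problem:
[plot not included]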
My incorrect PCA calculation code:
import numpy as np
import pandas as pd
from sklearn.decomposition import PCA

window = 252*5
# Initialize an empty df of appropriate size for the output
df_pca = pd.DataFrame(np.zeros((df.shape[0] - window + 1, df.shape[1])))

# Define PCA fit-transform function
# Note: Instead of attempting to return the result,
# it is written into the previously created output array.
def rolling_pca(window_data):
    # window_data arrives as floats, so cast it to integer row positions
    idx = window_data.astype(int)
    pca = PCA()
    transf = pca.fit_transform(df.iloc[idx])
    df_pca.iloc[idx[0]] = transf[0, :]
    # `apply` expects a numeric return value; it is not used
    return 0.0

# Create a df containing row indices for the workaround
df_idx = pd.DataFrame(np.arange(df.shape[0]))
# Use `rolling` to apply the PCA function (raw=True passes a NumPy array)
_ = df_idx.rolling(window).apply(rolling_pca, raw=True)

df = df.reset_index()
df = df.join(pd.DataFrame(df_pca[0]))
df.rename(columns={0: 'PCAprice'}, inplace=True)
df['PCAprice'] = df['PCAprice'].shift(window)
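For comparison, here is a minimal sketch of the expanding-window idea with direct access to the eigenvalues and eigenvectors. It is not taken from the linked post. The helper name `expanding_first_pc`, its `min_periods` parameter, the sign rule (flip the eigenvector whenever it points away from the previous day's eigenvector), and the synthetic `demo` data are all assumptions made for illustration. Like `PCA()` with default settings, it centers the data but does not standardize it.

```python
import numpy as np
import pandas as pd


def expanding_first_pc(data: pd.DataFrame, min_periods: int):
    """Hypothetical helper: first principal component on an expanding window.

    Returns (scores, loadings). scores[t] is the projection of day t onto the
    leading eigenvector estimated from all rows up to and including day t.
    """
    values = data.to_numpy(dtype=float)
    n_rows, n_cols = values.shape
    scores = np.full(n_rows, np.nan)
    loadings = np.full((n_rows, n_cols), np.nan)
    prev_vec = None

    for end in range(min_periods, n_rows + 1):
        window = values[:end]                      # backward-looking, expanding
        centered = window - window.mean(axis=0)
        # eigh returns eigenvalues in ascending order, so the last column
        # is the eigenvector belonging to the largest eigenvalue.
        eigvals, eigvecs = np.linalg.eigh(np.cov(centered, rowvar=False))
        vec = eigvecs[:, -1]

        if prev_vec is None:
            # Assumed convention for the first window:
            # make the largest-magnitude loading positive.
            if vec[np.argmax(np.abs(vec))] < 0:
                vec = -vec
        elif np.dot(vec, prev_vec) < 0:
            # Eigenvectors are only defined up to sign; keep the orientation
            # consistent with the previous day's vector.
            vec = -vec

        prev_vec = vec
        loadings[end - 1] = vec
        scores[end - 1] = centered[-1] @ vec

    return (
        pd.Series(scores, index=data.index, name="PCAprice"),
        pd.DataFrame(loadings, index=data.index, columns=data.columns),
    )


# Synthetic data for illustration only (not the market data from the question)
rng = np.random.default_rng(0)
dates = pd.bdate_range("1990-01-31", periods=300)
common = rng.normal(size=300).cumsum()
demo = pd.DataFrame(
    {name: common * scale + rng.normal(size=300)
     for name, scale in [("a", 1.0), ("b", -0.5), ("c", 2.0), ("d", 0.3)]},
    index=dates,
)
scores, loadings = expanding_first_pc(demo, min_periods=50)
```

Because each day's eigenvector is compared with the previous day's, the loadings cannot flip sign from one day to the next. The loop refits on the full history each day, so it costs more than a fixed-length rolling window.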