Python scikit-learn PCA to augment missing data in a historical VaR calculation

Posted 2024-09-30 08:27:28

I have a set of time series data that I want to use to calculate historical VaR for a large equity portfolio.
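For context, a minimal sketch of a historical-simulation VaR for a fixed-weight portfolio. The function name `historical_var`, the weights, and the simulated returns are assumptions made for illustration only; they are not part of my actual data or code.

```python
import numpy as np

def historical_var(returns, weights, alpha=0.99):
    """Historical-simulation VaR of a fixed-weight portfolio, as a positive loss fraction."""
    returns = np.asarray(returns, dtype=float)   # shape (T, N): T dates, N instruments
    weights = np.asarray(weights, dtype=float)   # shape (N,)
    portfolio_returns = returns @ weights        # shape (T,)
    # VaR is the loss at the (1 - alpha) quantile of the portfolio return distribution
    return -np.quantile(portfolio_returns, 1.0 - alpha)

# Hypothetical data: 500 days of simulated returns for 3 instruments
rng = np.random.default_rng(0)
sample_returns = rng.normal(0.0, 0.01, size=(500, 3))
var_99 = historical_var(sample_returns, [0.5, 0.3, 0.2])
```

Any NaN in the returns matrix propagates into the portfolio return and then into the quantile, which is why the gaps have to be filled (or the instruments dropped) before a calculation like this is usable.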

A large number of instruments in the portfolio lack time series data, and I need a systematic way to generate plausible values for the gaps.

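Where there is enough data to calculate factor exposures, I am considering using PCA to augment the missing data, and have tried the following Python implementation (Carol Alexander):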

import pandas as pd
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline

# get index time series returns
df = pd.read_csv('IndexData.csv',index_col=0)
# forward-fill price gaps, then convert prices to simple returns
df = df.ffill().pct_change().dropna(how='all')
rics = df.columns.tolist()

# add 'missing' data
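# (single .iloc call avoids chained assignment, which may not write back to df)
df.iloc[0:30, df.columns.get_loc('GDAXI')] = np.nan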

# see stack overflow reference below
pipeline = make_pipeline(StandardScaler(), PCA(n_components = len(rics) - 1))

# Step 1 - PCA for sub-period with GDAXI data - training period
dfSub = df.iloc[30:]
pipeline.fit(np.array(dfSub))
# copy the loadings so they are kept independently of the refit in Step 2
sub_components = pipeline.named_steps['pca'].components_.copy()

# Step 2 - PCA for entire period with no GDAXI (refits the same pipeline)
dfFull = df.loc[:,df.columns != 'GDAXI']
full_transf = pipeline.fit_transform(np.array(dfFull))

# Step 3 - Apply missing asset factor exposures in stage 1 to stage 2
#          to augment missing data
synthetic = np.dot(full_transf, sub_components[:, rics.index('GDAXI')])

# rescaling?? synthetic is still in standardized units at this point
df.iloc[0:30, df.columns.get_loc('GDAXI')] = synthetic[0:30]

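The example above assumes IndexData.csv contains price data for several European indices, including the DAX. In practice I would expect to work on baskets of reasonably highly correlated countries/sectors.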
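A possible variant of the steps above, sketched here on simulated data rather than IndexData.csv: fit the PCA only on the indices with a full history, regress the standardized GDAXI returns on the PC scores over the window where GDAXI is observed, then predict the gap and rescale with the mean and sigma of GDAXI from that observed window. The column names other than GDAXI, the choice of two components, and the use of `LinearRegression` are assumptions for illustration, and I am not claiming this is exactly the method Carol Alexander describes.

```python
import numpy as np
import pandas as pd
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import StandardScaler

# Simulated returns standing in for IndexData.csv (hypothetical data, one common factor)
rng = np.random.default_rng(42)
n_days, n_missing = 500, 30
market = rng.normal(0.0, 0.01, size=n_days)
names = ['GDAXI', 'FCHI', 'FTSE', 'STOXX50E']   # illustrative column names
returns = pd.DataFrame(
    {name: market + rng.normal(0.0, 0.003, size=n_days) for name in names}
)
actual = returns['GDAXI'].iloc[:n_missing].copy()   # kept only to compare afterwards
returns.iloc[:n_missing, returns.columns.get_loc('GDAXI')] = np.nan

others = returns.drop(columns='GDAXI')
observed = returns['GDAXI'].notna().to_numpy()

# Step 1 - PCA on the indices that have a full history
scaler = StandardScaler()
pca = PCA(n_components=2)
scores = pca.fit_transform(scaler.fit_transform(others))

# Step 2 - regress standardized GDAXI returns on the PC scores where GDAXI is observed
mu = returns.loc[observed, 'GDAXI'].mean()
sigma = returns.loc[observed, 'GDAXI'].std()
z_target = (returns.loc[observed, 'GDAXI'] - mu) / sigma
reg = LinearRegression().fit(scores[observed], z_target)

# Step 3 - predict standardized returns for the gap, then rescale with mu and sigma
synthetic = reg.predict(scores[~observed]) * sigma + mu
returns.loc[~observed, 'GDAXI'] = synthetic
```

Because the fitted values leave out the regression residual, the synthetic returns will tend to be less volatile than the real series, which matters for a tail measure like VaR. Whether and how to add that idiosyncratic noise back is part of what I am unsure about.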

Questions

What sigma and mean would/should I use to rescale the calculated returns?
Is there already functionality like this in another Python library?
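On the second question, one facility that looks related is scikit-learn's `IterativeImputer`, which models each column with missing values as a function of the other columns. It is regression-based rather than PCA-based, and it is still marked experimental, so the sketch below is only a point of comparison on simulated data (the data and the 30-row gap are assumptions for illustration).

```python
import numpy as np
from sklearn.experimental import enable_iterative_imputer  # noqa: F401 (activates the import below)
from sklearn.impute import IterativeImputer

# Simulated correlated returns (hypothetical data): 4 series driven by one common factor
rng = np.random.default_rng(7)
market = rng.normal(0.0, 0.01, size=(300, 1))
X = market + rng.normal(0.0, 0.003, size=(300, 4))
X_missing = X.copy()
X_missing[:30, 0] = np.nan            # first series lacks its first 30 observations

# Fill the gap from the other columns; observed entries are left unchanged
imputer = IterativeImputer(random_state=0)
X_filled = imputer.fit_transform(X_missing)
```

The imputed values come back in the original return units, so no separate rescaling step is needed here. The `sample_posterior=True` option draws imputations from the predictive distribution instead of using its mean, which may be closer to what a VaR calculation needs, though I have not verified that.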

[Attempt 2]

References

Sebastian Raschka - Implementing PCA in Python Step-By-Step

Stack Overflow - How to normalize with pca and scikit-learn

scikit-learn.org
