Pandas DataFrame math

Published 2024-09-25 16:33:50


The original question has been completely rewritten.

I read the raw data from the CSV file "CloseWeights4.csv":

df=pd.read_csv('CloseWeights4.csv')
Date      Symbol    ClosingPrice    Weight
3/1/2010    OGDC    116.51         0.1820219
3/2/2010    OGDC    117.32         0.1820219
3/3/2010    OGDC    116.4          0.1820219
3/4/2010    OGDC    116.58         0.1820219
3/5/2010    OGDC    117.61         0.1820219
3/1/2010    WTI      78.7          0.5348142
3/2/2010    WTI      79.68         0.5348142
3/3/2010    WTI      80.87         0.5348142
3/4/2010    WTI      80.21         0.5348142
3/5/2010    WTI      81.5          0.5348142
3/1/2010    FX       85.07         0.1312427
3/2/2010    FX       85.1077       0.1312427
3/3/2010    FX       85.049        0.1312427
3/4/2010    FX       84.9339       0.1312427
3/5/2010    FX       84.8          0.1312427
3/1/2010    PIB      98.1596499    0.1519211
3/2/2010    PIB      98.1596499    0.1519211
3/3/2010    PIB      98.1764222    0.1519211
3/4/2010    PIB      98.1770656    0.1519211
3/5/2010    PIB      98.1609364    0.1519211

From it I build a DataFrame, df2:

[code block not preserved in the source]
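The snippet that built df2 is not shown here. Judging from the shape of the returns table further down (dates as the index, one column per symbol), one plausible reconstruction is a pivot on Date and Symbol. This is only a sketch under that assumption; the inline CSV text is a shortened stand-in for CloseWeights4.csv.

```python
import io

import pandas as pd

# A few rows in the same layout as CloseWeights4.csv (values copied from the question).
csv_text = """Date,Symbol,ClosingPrice,Weight
3/1/2010,OGDC,116.51,0.1820219
3/2/2010,OGDC,117.32,0.1820219
3/1/2010,WTI,78.7,0.5348142
3/2/2010,WTI,79.68,0.5348142
"""
df = pd.read_csv(io.StringIO(csv_text), parse_dates=["Date"])

# Assumed reconstruction: one row per date, one column per symbol.
df2 = df.pivot(index="Date", columns="Symbol", values="ClosingPrice")
print(df2)
```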

From this, I compute the returns with the following formula:

ret=np.log(df2/df2.shift(1))

In [12]: ret

Out[12]: 
Symbol            FX      OGDC       PIB       WTI
Date                                              
2010-03-01       NaN       NaN       NaN       NaN
2010-03-02  0.000443  0.006928  0.000000  0.012375
2010-03-03 -0.000690 -0.007873  0.000171  0.014824
2010-03-04 -0.001354  0.001545  0.000007 -0.008195
2010-03-05 -0.001578  0.008796 -0.000164  0.015955

I take the weight of each security from df:

df3 = df.iloc[:, [1, 3]].drop_duplicates().set_index('Symbol')

df3
Out[14]: 
          Weight
Symbol          
OGDC    0.182022
WTI     0.534814
FX      0.131243
PIB     0.151921

I am trying to get the following daily weighted returns, but I do not know how to compute them in pandas:

Date        Portfolio_weighted_returns
2010-03-02        0.008174751
2010-03-03        0.006061657
2010-03-04       -0.005002414
2010-03-05        0.009058151
where the Portfolio_weighted_returns of 2010-03-02 is calculated as follows:
0.006928*0.182022+.012375*0.534814+0.000443*0.131243+0*0.151921 = 0.007937512315
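The hand calculation above can be checked with a short sketch. The two Series below are hypothetical containers for the 2010-03-02 returns and the weights quoted in the question; pandas multiplies them by matching symbol labels, so the different ordering of the two does not matter.

```python
import pandas as pd

# Log returns for 2010-03-02 and the weights, both copied from the question.
ret_day = pd.Series({"FX": 0.000443, "OGDC": 0.006928, "PIB": 0.000000, "WTI": 0.012375})
weights = pd.Series({"OGDC": 0.182022, "WTI": 0.534814, "FX": 0.131243, "PIB": 0.151921})

# Multiplication aligns on the symbol labels before the products are summed.
weighted_return = (ret_day * weights).sum()
print(round(weighted_return, 6))  # 0.007938
```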

Next, I need to multiply these results by a decay factor, defined as decFac = decay^(t). With decay = 0.5, the decFac values are:

Date        decFac
2010-03-02  0.0625
2010-03-03  0.125
2010-03-04  0.25
2010-03-05  0.5
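The table is consistent with t counting backwards from the most recent date (t = 1 on 2010-03-05, t = 4 on 2010-03-02). A minimal sketch under that assumption, with dates and dec_fac as illustrative names:

```python
import numpy as np
import pandas as pd

decay = 0.5
dates = pd.to_datetime(["2010-03-02", "2010-03-03", "2010-03-04", "2010-03-05"])

# Assumption: t counts back from the latest date, so the newest row has t = 1.
t = np.arange(len(dates), 0, -1)
dec_fac = pd.Series(decay ** t, index=dates, name="decFac")
print(dec_fac)
```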

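Finally, I need to square each daily portfolio weighted return, multiply it by the corresponding decFac, sum the products, and take the square root: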

SQRT(Sum(0.008174751^2*0.0625 + 0.006061657^2*0.125 + (-0.005002414)^2*0.25 + 0.009058151^2*0.5)) = 0.007487
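The same arithmetic can be sketched in NumPy, using the four returns and decay factors exactly as they are written in the expression above:

```python
import numpy as np

# Daily portfolio weighted returns and decay factors, copied from the question.
returns = np.array([0.008174751, 0.006061657, -0.005002414, 0.009058151])
dec_fac = np.array([0.0625, 0.125, 0.25, 0.5])

# Square each return, weight it by its decay factor, sum, then take the root.
result = np.sqrt(np.sum(returns**2 * dec_fac))
print(round(result, 6))  # 0.007487
```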

2 answers

You can do it like this:

In [267]: port_ret = ret.dot(df3)

In [268]: port_ret
Out[268]:
              Weight
Date
2010-03-01       NaN
2010-03-02  0.007938
2010-03-03  0.006431
2010-03-04 -0.004278
2010-03-05  0.009902

In [269]: decay = 0.5

In [270]: decay_df = pd.DataFrame({'decFac':decay**np.arange(len(ret), 0, -1)}, index=ret.index)

In [271]: decay_df
Out[271]:
             decFac
Date
2010-03-01  0.03125
2010-03-02  0.06250
2010-03-03  0.12500
2010-03-04  0.25000
2010-03-05  0.50000

In [272]: (port_ret.Weight**2 * decay_df.decFac).sum() ** 0.5
Out[272]: 0.007918790111274962

port_ret.Weight**2 * decay_df.decFac

[code block not preserved in the source]
import numpy as np
import pandas as pd

Define the variables:

[code block not preserved in the source]

To compute the daily returns, I use the .shift method:

# Calculate the daily returns for each security
df_ret = np.log( df2 / df2.shift(1) )
#                   FX      OGDC       PIB       WTI
# Date                                              
# 2010-03-01       NaN       NaN       NaN       NaN
# 2010-03-02  0.000443  0.006928  0.000000  0.012375
# 2010-03-03 -0.000690 -0.007873  0.000171  0.014824
# 2010-03-04 -0.001354  0.001545  0.000007 -0.008195
# 2010-03-05 -0.001578  0.008796 -0.000164  0.015955

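You need to multiply the Weight column of wgt by the returns to get the desired result. wgt['Weight'] returns a pd.Series, which behaves like a one-dimensional array, whereas a pd.DataFrame can generally be thought of as a two-dimensional array.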

df_wgt_ret = wgt['Weight'] * df_ret
#                   FX      OGDC           PIB       WTI
# Date                                                  
# 2010-03-01       NaN       NaN           NaN       NaN
# 2010-03-02  0.000081  0.003705  0.000000e+00  0.001880
# 2010-03-03 -0.000126 -0.004210  2.242285e-05  0.002252
# 2010-03-04 -0.000247  0.000826  8.609014e-07 -0.001245
# 2010-03-05 -0.000287  0.004704 -2.156434e-05  0.002424
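How the weights line up with the return columns depends on the index of wgt, which is not shown here. The sketch below assumes wgt is indexed by symbol; in that case a Series multiplied by a DataFrame is matched label by label against the DataFrame's columns. The single-row df_ret is a cut-down stand-in that uses the 2010-03-02 returns from the question.

```python
import pandas as pd

# One day of returns, one column per symbol (values from the question).
df_ret = pd.DataFrame(
    {"FX": [0.000443], "OGDC": [0.006928], "PIB": [0.000000], "WTI": [0.012375]},
    index=pd.to_datetime(["2010-03-02"]),
)

# Assumption: wgt is indexed by symbol, in the order the symbols appear in the CSV.
wgt = pd.DataFrame(
    {"Weight": [0.182022, 0.534814, 0.131243, 0.151921]},
    index=["OGDC", "WTI", "FX", "PIB"],
)

# The Series index (symbols) is aligned with the DataFrame columns by label.
weighted = wgt["Weight"] * df_ret
print(weighted)
```

If wgt carried a default integer index instead, the labels would not match the column names, so it is worth checking the index of wgt before multiplying.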

Summing across the columns (axis=1) gives the portfolio return. Note that this returns a pd.Series, not a DataFrame:

port_ret = df_wgt_ret.sum(axis=1)
# Date
# 2010-03-01         NaN
# 2010-03-02    0.005666
# 2010-03-03   -0.002061
# 2010-03-04   -0.000664
# 2010-03-05    0.006820
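One detail to be aware of: in recent pandas versions, sum skips NaN values and returns 0.0 for a row that is entirely NaN, so the first date may come out as 0.0 rather than NaN. Passing min_count=1 keeps it as NaN. A small sketch with illustrative two-column data:

```python
import numpy as np
import pandas as pd

# Illustrative weighted returns: the first row is all NaN, as after shift(1).
df_wgt_ret = pd.DataFrame(
    {"FX": [np.nan, 0.000058], "OGDC": [np.nan, 0.001261]},
    index=pd.to_datetime(["2010-03-01", "2010-03-02"]),
)

default_sum = df_wgt_ret.sum(axis=1)               # all-NaN row sums to 0.0
strict_sum = df_wgt_ret.sum(axis=1, min_count=1)   # all-NaN row stays NaN
print(default_sum)
print(strict_sum)
```

Either form gives the same final figure here, because a 0.0 or a NaN in the first row contributes nothing to the later sum of squares.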

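Finally, multiply the squared portfolio returns by the decay factors and sum the products; the multiplication is applied element-wise, date by date. The square root of the sum is the final result:

total_ret = (port_ret**2 * sr_dec).sum()
final_res = total_ret**0.5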

The one-liner

I am assuming decFac is a DataFrame with a column named decFac, and I use df3 and {} as you defined them.

result = (( (df3.Weight * ret).sum(axis=1)**2 * decFac.decFac ).sum())**.5
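For reference, the one-liner can be run end to end on the question's data. The sketch below makes several assumptions: df2 is built directly from the quoted prices, df3 is indexed by Symbol, and decFac is constructed as in the first answer (decay = 0.5, counting back from the latest date). Under those assumptions the result should agree with the first answer's figure of roughly 0.00792.

```python
import numpy as np
import pandas as pd

dates = pd.to_datetime(
    ["2010-03-01", "2010-03-02", "2010-03-03", "2010-03-04", "2010-03-05"]
)

# Closing prices copied from the question, one column per symbol.
df2 = pd.DataFrame(
    {
        "FX": [85.07, 85.1077, 85.049, 84.9339, 84.8],
        "OGDC": [116.51, 117.32, 116.4, 116.58, 117.61],
        "PIB": [98.1596499, 98.1596499, 98.1764222, 98.1770656, 98.1609364],
        "WTI": [78.7, 79.68, 80.87, 80.21, 81.5],
    },
    index=dates,
)

# Weights copied from the question; assumed to be indexed by Symbol.
df3 = pd.DataFrame(
    {"Weight": [0.1820219, 0.5348142, 0.1312427, 0.1519211]},
    index=pd.Index(["OGDC", "WTI", "FX", "PIB"], name="Symbol"),
)

# Daily log returns; the first row is NaN because there is no prior day.
ret = np.log(df2 / df2.shift(1))

# Decay factors built as in the first answer.
decay = 0.5
decFac = pd.DataFrame(
    {"decFac": decay ** np.arange(len(ret), 0, -1)}, index=ret.index
)

# Weighted portfolio return per day, then the decayed root of the sum of squares.
port_ret = (df3.Weight * ret).sum(axis=1, min_count=1)
result = ((port_ret**2 * decFac.decFac).sum()) ** 0.5
print(result)
```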
