Computing jarque_bera with scipy

ranking        Q1     Q2     Q3     Q4
Date
2009-12-29    nan    nan    nan    nan
2009-12-30   0.12  -0.21  -0.36  -0.39
2009-12-31   0.05   0.09   0.06  -0.02
2010-01-01    nan    nan    nan    nan
2010-01-04   1.45   1.90   1.81   1.77
...           ...    ...    ...    ...
2020-10-13  -0.67  -0.59  -0.63  -0.61
2020-10-14  -0.05  -0.12  -0.05  -0.13
2020-10-15  -1.91  -1.62  -1.78  -1.91
2020-10-16   1.21   1.13   1.09   1.37
2020-10-19  -0.03   0.01   0.06  -0.02

import numpy as np
import pandas as pd
from scipy import stats

def stat(x):
    return pd.Series([x.mean(),
                      np.sqrt(x.var()),
                      stats.jarque_bera(x),
                      ],
                     index=['Return',
                            'Volatility',
                            'JB P-Value'
                            ])

data.apply(stat)

1 answer

User

#1 · Posted 2024-09-28 01:23:56

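I tried to reproduce this by copying the 10 rows of data shown above, and the function ran fine for me. That points to a data input problem, where some column appears to have a different number of values than its pd.Series index has entries (that is, len(data[col].values) != len(data[col].index)). You can try to find out which column it is by running a simple "debugging" loop, for example: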

for col in data.columns: 
    if len(data[col].values) != len(data[col].index):
        print(f"Column {col} has more/less values than the index")

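However, the SciPy Jarque-Bera test documentation says that x can be any "array-like" structure, so you don't need to pass a pd.Series, which can run you into problems with missing values and the like. Essentially, you can just pass a list of values and compute their JB test statistic and p-value.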
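As a minimal sketch of that point, the snippet below passes a plain list and a NumPy array to stats.jarque_bera. The sample values are the eight non-NaN Q1 values visible in the question's excerpt, and the variable names are only illustrative.

```python
import numpy as np
from scipy import stats

# The non-NaN Q1 values visible in the question's excerpt (illustrative only).
returns = [0.12, 0.05, 1.45, -0.67, -0.05, -1.91, 1.21, -0.03]

# jarque_bera accepts any array-like; the result unpacks as (statistic, p-value).
jb_stat, jb_pvalue = stats.jarque_bera(returns)

# The same call on a NumPy array gives the same numbers.
jb_stat_arr, jb_pvalue_arr = stats.jarque_bera(np.array(returns))

print(jb_stat, jb_pvalue)
```

Note that the call returns two numbers, the test statistic and the p-value, so a single 'JB P-Value' entry needs only the second one.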

So I would change your function to:

def stat(x):
    return pd.Series([x.mean(),
                      np.sqrt(x.var()),
                      # drop NaN, pass a NumPy array instead of a pd.Series, and keep only the p-value (index 1)
                      stats.jarque_bera(x.dropna().values)[1],
                      ],
                     index=['Return',
                            'Volatility',
                            'JB P-Value'
                            ])
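To check the idea end to end, here is a hedged variant that reports the test statistic and the p-value as separate rows. The DataFrame is a hypothetical stand-in for the question's `data` (random values with two all-NaN rows), and the extra 'JB Statistic' row is an addition for illustration, not part of the original function.

```python
import numpy as np
import pandas as pd
from scipy import stats

# Hypothetical stand-in for the question's `data`: four columns of random
# values, with two all-NaN rows mimicking the NaN rows in the excerpt.
rng = np.random.default_rng(0)
data = pd.DataFrame(rng.normal(size=(50, 4)), columns=["Q1", "Q2", "Q3", "Q4"])
data.iloc[[0, 3]] = np.nan

def stat(x):
    # Drop NaN first, then unpack the (statistic, p-value) pair.
    jb_stat, jb_pvalue = stats.jarque_bera(x.dropna().values)
    return pd.Series(
        [x.mean(), x.std(), jb_stat, jb_pvalue],
        index=["Return", "Volatility", "JB Statistic", "JB P-Value"],
    )

# One column of summary values per input column.
summary = data.apply(stat)
print(summary)
```

Here x.std() is equivalent to np.sqrt(x.var()), since both use the sample (ddof=1) estimate and skip NaN by default.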
