Statsmodels sandwich "all groups are empty taking lags" error when clustering standard errors with HAC-Panel

Posted 2024-07-01 08:14:54


I have data like the following:

[data table not preserved in the source]

I am running a difference-in-differences regression:

results2 = smf.ols("DV ~ C(group, Treatment('control')) * C(pre_period, Treatment(True)) + month + C(year)",
                  df99).fit(cov_type='HAC-Panel', cov_kwds={'groups':df99['account'], 'time':df99['yearmonth'], 'maxlags':35})
print(results2.summary())

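I get the error message below. I do the same thing with other datasets that are more or less identical (different experiments) and run into no problems. My data cleaning process is essentially the same. Moreover, this code was working fine just a few days ago, and now it suddenly throws this error (I did revert every change I had made).

I can't make sense of this error at all. Even searching for the error message in "sandwich_covariance.py" turned up nothing.

This person had a similar problem, but no solution was offered: Python statsmodels robust cov_type='hac-panel' issue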

  ---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-43-0dcbaef1325b> in <module>
----> 1 results2 = smf.ols("dv ~ C(group, Treatment('control')) * C(pre_period, Treatment(True)) + month + C(year)",
      2                   df99).fit(cov_type='HAC-Panel', cov_kwds={'groups':df99['account'], 'time':df99['yearmonth'], 'maxlags':35})
      3 print(results2.summary())

~/opt/anaconda3/envs/pyr/lib/python3.8/site-packages/statsmodels/regression/linear_model.py in fit(self, method, cov_type, cov_kwds, use_t, **kwargs)
    340 
    341         if isinstance(self, OLS):
--> 342             lfit = OLSResults(
    343                 self, beta,
    344                 normalized_cov_params=self.normalized_cov_params,

~/opt/anaconda3/envs/pyr/lib/python3.8/site-packages/statsmodels/regression/linear_model.py in __init__(self, model, params, normalized_cov_params, scale, cov_type, cov_kwds, use_t, **kwargs)
   1584                     use_t = use_t_2
   1585                 # TODO: warn or not?
-> 1586             self.get_robustcov_results(cov_type=cov_type, use_self=True,
   1587                                        use_t=use_t, **cov_kwds)
   1588         for key in kwargs:

~/opt/anaconda3/envs/pyr/lib/python3.8/site-packages/statsmodels/regression/linear_model.py in get_robustcov_results(self, cov_type, use_t, **kwargs)
   2530             groupidx = lzip([0] + tt, tt + [nobs_])
   2531             self.n_groups = n_groups = len(groupidx)
-> 2532             res.cov_params_default = sw.cov_nw_panel(self, maxlags, groupidx,
   2533                                                      weights_func=weights_func,
   2534                                                      use_correction=use_correction)

~/opt/anaconda3/envs/pyr/lib/python3.8/site-packages/statsmodels/stats/sandwich_covariance.py in cov_nw_panel(results, nlags, groupidx, weights_func, use_correction)
    785     xu, hessian_inv = _get_sandwich_arrays(results)
    786 
--> 787     S_hac = S_nw_panel(xu, weights, groupidx)
    788     cov_hac = _HCCM2(hessian_inv, S_hac)
    789     if use_correction:

~/opt/anaconda3/envs/pyr/lib/python3.8/site-packages/statsmodels/stats/sandwich_covariance.py in S_nw_panel(xw, weights, groupidx)
    723     S = weights[0] * np.dot(xw.T, xw)  #weights just for completeness
    724     for lag in range(1, nlags+1):
--> 725         xw0, xwlag = lagged_groups(xw, lag, groupidx)
    726         s = np.dot(xw0.T, xwlag)
    727         S += weights[lag] * (s + s.T)

~/opt/anaconda3/envs/pyr/lib/python3.8/site-packages/statsmodels/stats/sandwich_covariance.py in lagged_groups(x, lag, groupidx)
    706 
    707     if out0 == []:
--> 708         raise ValueError('all groups are empty taking lags')
    709     #return out0, out_lagged
    710     return np.vstack(out0), np.vstack(out_lagged)

ValueError: all groups are empty taking lags

1 answer
Answer 1 · posted 2024-07-01 08:14:54

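As our savior Josef very rightly pointed out, the problem was simply that the data was not sorted the way I showed it in the example.

The data should be sorted first by the group you are clustering on (account, in my case) and then by time. This follows from the documentation: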

‘hac-panel’ heteroscedasticity and autocorrelation robust standard errors in panel data. The data needs to be sorted in this case, the time series for each panel unit or cluster need to be stacked. The membership to a timeseries of an individual or group can be either specified by group indicators or by increasing time periods.
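To make the sorting requirement concrete, here is a minimal sketch on synthetic data. The frame, the column x, and the helper fit_hac_panel are made up for illustration; only the column names account, yearmonth and DV follow the question. The traceback above suggests that group boundaries are detected wherever the group label changes between consecutive rows, so when rows are ordered by time every "group" is a single row and no lag can be taken. Sorting by account and then by time stacks each account's series into one contiguous block.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic panel (hypothetical data): 20 accounts observed over 12 periods.
rng = np.random.default_rng(0)
n_accounts, n_periods = 20, 12
df = pd.DataFrame({
    "account": np.repeat(np.arange(n_accounts), n_periods),
    "yearmonth": np.tile(np.arange(n_periods), n_accounts),
})
df["x"] = rng.normal(size=len(df))
df["DV"] = 1.0 + 0.5 * df["x"] + rng.normal(size=len(df))


def fit_hac_panel(data, maxlags=3):
    """Illustrative helper: OLS with HAC-Panel standard errors by account."""
    return smf.ols("DV ~ x", data).fit(
        cov_type="hac-panel",
        cov_kwds={"groups": data["account"].to_numpy(), "maxlags": maxlags},
    )


# Ordered by time first: consecutive rows belong to different accounts,
# so each detected group has length 1 and the fit is expected to fail.
by_time = df.sort_values(["yearmonth", "account"]).reset_index(drop=True)
try:
    fit_hac_panel(by_time)
    error_message = None
except ValueError as exc:
    error_message = str(exc)
print("ordered by time:", error_message)

# Ordered by account, then time: each account is one contiguous block.
sorted_df = df.sort_values(["account", "yearmonth"]).reset_index(drop=True)
res_sorted = fit_hac_panel(sorted_df)
print(res_sorted.bse)
```

Note that maxlags must also be smaller than the length of at least one account's series, otherwise no group is long enough for the largest lag and the same error would be expected even on sorted data.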

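If for whatever reason you cannot sort the data as Josef suggested, you can use HAC-groupsum (cov_type='hac-groupsum') instead, which works even when the data is not sorted. The results will of course differ slightly.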
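A minimal sketch of the HAC-groupsum alternative, again on synthetic data with a made-up helper fit_hac_groupsum. It assumes, as far as I understand the statsmodels internals, that the time keyword should be an array of consecutive non-negative integer codes rather than raw labels such as 201901, so the labels are converted with pd.factorize first. Treat that conversion as a precaution, not as documented behavior.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic panel (hypothetical data): 20 accounts observed over 12 periods.
rng = np.random.default_rng(1)
n_accounts, n_periods = 20, 12
df = pd.DataFrame({
    "account": np.repeat(np.arange(n_accounts), n_periods),
    "yearmonth": np.tile(201901 + np.arange(n_periods), n_accounts),
})
df["x"] = rng.normal(size=len(df))
df["DV"] = 1.0 + 0.5 * df["x"] + rng.normal(size=len(df))

# Shuffle the rows so the data is sorted neither by account nor by time.
shuffled = df.sample(frac=1.0, random_state=0).reset_index(drop=True)


def fit_hac_groupsum(data, maxlags=2):
    """Illustrative helper: OLS with HAC-groupsum standard errors."""
    # Assumption: map time labels to integer codes 0, 1, 2, ... in time order.
    time_codes, _ = pd.factorize(data["yearmonth"], sort=True)
    return smf.ols("DV ~ x", data).fit(
        cov_type="hac-groupsum",
        cov_kwds={"time": time_codes, "maxlags": maxlags},
    )


res_shuffled = fit_hac_groupsum(shuffled)
print(res_shuffled.bse)
```

Because this estimator sums the moment conditions within each time period before applying the kernel, it only needs to know which period a row belongs to, not where the row sits in the frame.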
