Python 2.7 - statsmodels - formatting and writing summary output

                           Logit Regression Results
==============================================================================
Dep. Variable:            death_death   No. Observations:                 9752
Model:                          Logit   Df Residuals:                     9747
Method:                           MLE   Df Model:                            4
Date:                Wed, 22 May 2013   Pseudo R-squ.:                -0.02672
Time:                        22:15:05   Log-Likelihood:                -5806.9
converged:                       True   LL-Null:                       -5655.8
                                        LLR p-value:                     1.000
===============================================================================
                  coef    std err          z      P>|z|     [95.0% Conf. Int.]
-------------------------------------------------------------------------------
age_age5064    -0.1999      0.055     -3.619      0.000       -0.308    -0.092
age_age6574    -0.2553      0.053     -4.847      0.000       -0.359    -0.152
sex_female     -0.2515      0.044     -5.765      0.000       -0.337    -0.166
stage_early    -0.1838      0.041     -4.528      0.000       -0.263    -0.104
access         -0.0102      0.001    -16.381      0.000       -0.011    -0.009
===============================================================================

`Log-Likelihood, age_age5064_coef, age_age5064_std_err, age_age5064_z, age_age5064_p>|z|,...age_age6574_coef, age_age6574_std_err, ......access_coef, access_std_err, ....age_age5064_odds_ratio, age_age6574_odds_ratio, ...sex_female_odds_ratio,.....access_odds_ratio`

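3 answers

User

#1 · edited 2024-09-30 00:35:41

I found this formulation a little more straightforward. You can add or drop columns by following the pattern used in the example (pvals, coeff, conf_lower, conf_higher).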

import pandas as pd  # can be omitted if pandas is already imported

def results_summary_to_dataframe(results):
    '''Take a statsmodels results object and return its key statistics as a DataFrame.'''
    pvals = results.pvalues
    coeff = results.params
    # conf_int() returns a DataFrame with columns 0 (lower) and 1 (upper)
    # when the model was fit on pandas data
    conf_int = results.conf_int()
    conf_lower = conf_int[0]
    conf_higher = conf_int[1]

    results_df = pd.DataFrame({"pvals": pvals,
                               "coeff": coeff,
                               "conf_lower": conf_lower,
                               "conf_higher": conf_higher
                               })

    # Reorder the columns
    results_df = results_df[["coeff", "pvals", "conf_lower", "conf_higher"]]
    return results_df
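
To illustrate the "add columns" idea, here is a minimal sketch that also collects standard errors, z-values and odds ratios. It assumes the results object exposes `bse` and `tvalues` as pandas Series alongside `params` and `pvalues`, and that the odds ratio is simply `exp(coef)` for a logit model. `StubResults` and `results_to_wide_dataframe` are hypothetical names used only so the example runs without fitting a model; the numbers are copied from two rows of the summary table in the question.

```python
import numpy as np
import pandas as pd


def results_to_wide_dataframe(results):
    """Collect per-term statistics plus odds ratios, one row per term."""
    conf = results.conf_int()  # assumed: DataFrame with columns 0 and 1
    columns = ["coeff", "std_err", "z", "pvals",
               "conf_lower", "conf_higher", "odds_ratio"]
    return pd.DataFrame({
        "coeff": results.params,
        "std_err": results.bse,
        "z": results.tvalues,
        "pvals": results.pvalues,
        "conf_lower": conf[0],
        "conf_higher": conf[1],
        "odds_ratio": np.exp(results.params),  # exp(coef) for a logit model
    }, columns=columns)


class StubResults(object):
    """Hypothetical stand-in for a fitted statsmodels results object."""
    _index = ["age_age5064", "access"]
    params = pd.Series([-0.1999, -0.0102], index=_index)
    bse = pd.Series([0.055, 0.001], index=_index)
    tvalues = pd.Series([-3.619, -16.381], index=_index)
    pvalues = pd.Series([0.0, 0.0], index=_index)

    def conf_int(self):
        return pd.DataFrame({0: [-0.308, -0.011], 1: [-0.092, -0.009]},
                            index=self._index)


df = results_to_wide_dataframe(StubResults())
print(df)
```

With a real fitted model you would pass the results object instead of the stub; the resulting frame can then be written out with `df.to_csv(...)`.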

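User

#2 · edited 2024-09-30 00:35:41

There is currently no premade table of the parameters together with their result statistics.

Essentially you need to stack all the results yourself, whether in a list, a numpy array or a pandas DataFrame, depending on what is more convenient for you.

For example, if I want one numpy array per model that holds llf plus the results shown in the summary parameter table, I could use: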

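import numpy

res_all = []
for res in results:   # results: a list of fitted results instances
    # conf_int() holds one (lower, upper) pair per parameter; transpose to unpack the columns
    low, upp = numpy.asarray(res.conf_int()).T
    res_all.append(numpy.concatenate(([res.llf], res.params, res.tvalues, res.pvalues,
                   low, upp)))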

Depending on how your results are structured across models, it might be better to align this with pandas.

You could write a helper function that takes all the results from a results instance and concatenates them into a single row.
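
A minimal sketch of such a helper, assuming the results object exposes `llf`, `params`, `bse`, `tvalues`, `pvalues` and `model.exog_names`, and that the odds ratio is `exp(coef)`. The column naming follows the layout requested in the question. `results_to_row` and the `_Stub` class are hypothetical names; the stub only stands in for a fitted model, with values copied from the summary table above.

```python
import math


def results_to_row(res):
    """Flatten one results object into parallel (header, row) lists."""
    names = res.model.exog_names
    header = ["Log-Likelihood"]
    row = [res.llf]
    # Per-term statistics: coef, std err, z, p-value
    for name, coef, se, z, p in zip(names, res.params, res.bse,
                                    res.tvalues, res.pvalues):
        header += [name + "_coef", name + "_std_err",
                   name + "_z", name + "_p>|z|"]
        row += [coef, se, z, p]
    # Odds ratios go at the end, as in the requested column order
    for name, coef in zip(names, res.params):
        header.append(name + "_odds_ratio")
        row.append(math.exp(coef))
    return header, row


class _Stub(object):
    """Hypothetical stand-in so the sketch runs without fitting a model."""


model = _Stub()
model.exog_names = ["age_age5064", "access"]
res = _Stub()
res.model = model
res.llf = -5806.9
res.params = [-0.1999, -0.0102]
res.bse = [0.055, 0.001]
res.tvalues = [-3.619, -16.381]
res.pvalues = [0.0, 0.0]

header, row = results_to_row(res)
print(header)
print(row)
```

Returning plain lists keeps the helper independent of pandas; calling it once per model yields rows that can be stacked in whatever container is most convenient.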

(I'm not sure which approach is most convenient for writing to csv row by row.)
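
One option for row-wise output is the standard library `csv` module. The sketch below assumes Python 3 and that each model has already been flattened into a list of values; `write_rows` is a hypothetical helper name, and the single example row reuses numbers from the summary table above. It writes to an in-memory buffer so it runs anywhere, but any open file object would do.

```python
import csv
import io


def write_rows(fileobj, header, rows):
    """Write a header line followed by one csv line per model."""
    writer = csv.writer(fileobj, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)


header = ["Log-Likelihood", "access_coef", "access_std_err"]
rows = [[-5806.9, -0.0102, 0.001]]  # one list per fitted model

buf = io.StringIO()
write_rows(buf, header, rows)
print(buf.getvalue())
```

To write to disk instead, open the target with `open(path, "w", newline="")` and pass that file object in place of `buf`.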

Edit:

Here is an example that stores regression results in a DataFrame:

https://github.com/statsmodels/statsmodels/blob/master/statsmodels/sandbox/multilinear.py#L21

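The loop is on line 159.

summary() and similar code outside of statsmodels, for example http://johnbeieler.org/py_apsrtable/ for combining several results, is oriented towards printing rather than storing variables.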

User

#3 · edited 2024-09-30 00:35:41

results.params: the coefficients
results.pvalues: the p-values

By the way, you can use dir(results) to list all the attributes of the object.
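
Since `dir()` also returns many private and dunder names, a small filter can make the listing easier to scan. This is only a sketch; `public_attributes` and `StubResults` are hypothetical names, and a real statsmodels results object exposes far more attributes than the stub.

```python
def public_attributes(obj):
    """Return the names from dir(obj) that do not start with an underscore."""
    return [name for name in dir(obj) if not name.startswith("_")]


class StubResults(object):
    """Hypothetical stand-in for a fitted results object."""
    params = [-0.1999, -0.0102]
    pvalues = [0.0, 0.0]
    llf = -5806.9


print(public_attributes(StubResults()))
```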
