Why doesn't the output of statsmodels' Cochran's Q test change in this code?

Posted 2024-10-03 04:39:02


I am using the following code:

from statsmodels.stats.contingency_tables import cochrans_q 
res = cochrans_q([[1,4,5],[9,6,8]])
print(res)

The output is:

df          2
pvalue      0.36787944117144245
statistic   2.0

But the output stays the same for [[10,4,5],[9,6,8]], [[55,88,77],[99,46,88]], and so on.

The statsmodels documentation page is here, and the Wikipedia page for Cochran's Q test is here.

What is the problem, and how can I fix it? Thanks for your help.


2 answers

Here is the definition of each result:


df: In statistics, the degrees of freedom (DF) indicate the number of independent values that can vary in an analysis without breaking any constraints. It is an essential idea that appears in many contexts throughout statistics including hypothesis tests, probability distributions, and regression analysis.


pvalue: In statistics, the p-value is the probability of obtaining results as extreme as the observed results of a statistical hypothesis test, assuming that the null hypothesis is correct.


statistic: A statistic (singular) or sample statistic is any quantity computed from values in a sample that is used for a statistical purpose. Statistical purposes include estimating a population parameter, describing a sample, or evaluating a hypothesis.


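As you can see, these definitions do not depend on the value of each element; what matters is how many elements there are.

Your program returns the same result for
[[1,4,5],[9,6,8]], [[10,4,5],[9,6,8]], and [[55,88,77],[99,46,88]]
because each list contains 3 elements.

Cochran's Q test is a nonparametric method for detecting differences among three or more matched sets of frequencies or proportions. This means that, when running Cochran's Q test, the value of each element does not matter.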

cochrans_q takes binary data, not counts.
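To see why the three inputs from the question give identical results, here is a minimal pure-Python sketch of Cochran's Q. It assumes that cochrans_q treats the largest value in the array as "success" and every other value as "failure". That is my reading of the statsmodels source, not something the documentation states. The helper names `success_mask` and `cochran_q_stat` are made up for illustration.

```python
def success_mask(table):
    """Mark cells equal to the largest value as 1 (success), all others as 0.

    Assumption: this mirrors how statsmodels binarizes non-binary input.
    """
    top = max(max(row) for row in table)
    return [[1 if v == top else 0 for v in row] for row in table]


def cochran_q_stat(binary):
    """Cochran's Q for rows (subjects) of 0/1 values (treatments)."""
    k = len(binary[0])  # number of treatments (columns)
    col_totals = [sum(col) for col in zip(*binary)]
    row_totals = [sum(row) for row in binary]
    total = sum(row_totals)
    numerator = (k - 1) * (k * sum(c * c for c in col_totals) - total ** 2)
    denominator = k * total - sum(r * r for r in row_totals)
    return numerator / denominator


inputs = [
    [[1, 4, 5], [9, 6, 8]],
    [[10, 4, 5], [9, 6, 8]],
    [[55, 88, 77], [99, 46, 88]],
]
results = [cochran_q_stat(success_mask(t)) for t in inputs]
for t, q in zip(inputs, results):
    print(success_mask(t), q)
```

Under that assumption, each input has exactly one "success", and it sits in the first column, so Q comes out as 2.0 every time. That matches the statistic shown in the question.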

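The statsmodels documentation is often not very explicit, but the expected behavior can be seen in the unit tests.

The unit test below shows how to convert frequency data into the format statsmodels expects.

Source: https://github.com/statsmodels/statsmodels/blob/master/statsmodels/stats/tests/test_nonparametric.py#L190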

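import numpy as np
from numpy.testing import assert_allclose
from statsmodels.stats.contingency_tables import cochrans_q


def test_cochransq3():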
    # another example compared to SAS
    # in frequency weight format
    dt = [('A', 'S1'), ('B', 'S1'), ('C', 'S1'), ('count', int)]
    dta = np.array([('F', 'F', 'F', 6),
                    ('U', 'F', 'F', 2),
                    ('F', 'F', 'U', 16),
                    ('U', 'F', 'U', 4),
                    ('F', 'U', 'F', 2),
                    ('U', 'U', 'F', 6),
                    ('F', 'U', 'U', 4),
                    ('U', 'U', 'U', 6)], dt)

    cases = np.array([[0, 0, 0],
                      [1, 0, 0],
                      [0, 0, 1],
                      [1, 0, 1],
                      [0, 1, 0],
                      [1, 1, 0],
                      [0, 1, 1],
                      [1, 1, 1]])
    count = np.array([ 6,  2, 16,  4,  2,  6,  4,  6])
    data = np.repeat(cases, count, 0)

    res = cochrans_q(data)
    assert_allclose([res.statistic, res.pvalue], [8.4706, 0.0145], atol=5e-5)
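The conversion step in the test can also be sketched without NumPy, as a rough illustration of what `np.repeat(cases, count, 0)` does: each 0/1 pattern is repeated as many times as its count, so every row ends up representing one subject. The function names `expand_counts` and `cochran_q` are hypothetical, and the p-value shortcut used here is valid only for 2 degrees of freedom (three treatments).

```python
import math


def expand_counts(cases, counts):
    """Repeat each 0/1 pattern `n` times to get one row per subject."""
    rows = []
    for case, n in zip(cases, counts):
        rows.extend(list(case) for _ in range(n))
    return rows


def cochran_q(binary):
    """Return (Q statistic, degrees of freedom) for rows of 0/1 values."""
    k = len(binary[0])  # number of treatments (columns)
    col_totals = [sum(col) for col in zip(*binary)]
    row_totals = [sum(row) for row in binary]
    total = sum(row_totals)
    numerator = (k - 1) * (k * sum(c * c for c in col_totals) - total ** 2)
    denominator = k * total - sum(r * r for r in row_totals)
    return numerator / denominator, k - 1


# Same patterns and frequencies as in the unit test above.
cases = [[0, 0, 0], [1, 0, 0], [0, 0, 1], [1, 0, 1],
         [0, 1, 0], [1, 1, 0], [0, 1, 1], [1, 1, 1]]
counts = [6, 2, 16, 4, 2, 6, 4, 6]

data = expand_counts(cases, counts)
q, df = cochran_q(data)
# For df == 2 only, the chi-square survival function reduces to exp(-q / 2).
p = math.exp(-q / 2)
print(len(data), round(q, 4), round(p, 4))
```

The 8 patterns expand to 46 rows, and the statistic and p-value agree with the values asserted in the unit test (8.4706 and 0.0145, within the stated tolerance).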
