sklearn and scipy.stats give different chi2 values for the same contingency table

Posted 2024-10-03 04:27:01


import numpy as np

# 146 samples of a single feature; every non-zero value lies in the first 73 samples.
X = np.array([
    72, 24, 0, 9, 0, 3, 0, 54, 0, 0,
    0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
    0, 3, 0, 0, 0, 0, 0, 0, 15, 0,
    0, 0, 0, 0, 0, 0, 0, 0, 0, 111,
    27, 0, 6, 0, 0, 0, 0, 0, 0, 0,
    0, 0, 0, 0, 0, 3, 0, 0, 17, 3,
    0, 0, 0, 8, 52, 18, 52, 52, 50, 0,
] + [0] * 76, dtype=float)  # the remaining 76 values are all zero


# Class labels: 73 samples of class 0 followed by 73 samples of class 1.
y = np.array([0.0] * 73 + [1.0] * 73)

These are X (for simplicity I am using only one feature for now) and y. The first 73 samples belong to class 0 and the remaining 73 to class 1.

Now I want to compute the chi2 score of this feature. sklearn.feature_selection.chi2 gives me 579, whereas scipy.stats.chi2_contingency gives 21.

The code I used:

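import scipy.stats
obs = np.array([[0, 19], [73, 54]])
# Returns the statistic, p-value, degrees of freedom and expected frequencies.
print(scipy.stats.chi2_contingency(obs, correction=False))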

This gives 21 as the answer, which I believe is the correct one, because the formula is (a*d-b*c)**2*float(n)/((a+c)*(b+d)*(a+b)*(c+d))
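As a quick sanity check, the formula can be evaluated directly with the four counts from obs. This is only a sketch: the function name chi2_2x2 is made up for illustration, and it assumes the table is laid out as [[a, b], [c, d]].

```python
def chi2_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    return (a * d - b * c) ** 2 * float(n) / ((a + c) * (b + d) * (a + b) * (c + d))

# Counts taken from obs = [[0, 19], [73, 54]]
stat = chi2_2x2(0, 19, 73, 54)
print(round(stat, 2))  # 21.84
```

So the "21" reported by scipy is 21.84 before truncation.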

But sklearn gives 579. The code:

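import sklearn.feature_selection
X_d = X.reshape(-1, 1)  # shape (n_samples, 1): one column per feature
# y is passed as a 1-D array of labels, so no reshape is needed for it.
print(sklearn.feature_selection.chi2(X_d, y))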

Why are the chi2 values different in the two cases?
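One likely source of the 579, sketched in plain Python rather than with the library itself: as far as I can tell from the sklearn source, feature_selection.chi2 does not build a presence/absence table. It treats the per-class sums of the raw feature values as the observed counts and compares them with the sums expected from the class proportions. The helper name chi2_from_sums and the compact way x is rebuilt here are assumptions for illustration; only the 19 non-zero values are taken from X.

```python
# The 19 non-zero feature values from X; all of them fall in class 0.
nonzero_class0 = [72, 24, 9, 3, 54, 3, 15, 111, 27, 6, 3, 17, 3, 8, 52, 18, 52, 52, 50]
x = nonzero_class0 + [0] * (73 - len(nonzero_class0)) + [0] * 73
y = [0] * 73 + [1] * 73

def chi2_from_sums(x, y):
    """Chi-square using per-class sums of feature values as the observed counts."""
    total = float(sum(x))
    stat = 0.0
    for c in sorted(set(y)):
        # Observed: sum of the feature values over the samples of class c.
        observed = sum(xi for xi, yi in zip(x, y) if yi == c)
        # Expected: overall feature total scaled by the share of class c.
        expected = total * sum(1 for yi in y if yi == c) / len(y)
        stat += (observed - expected) ** 2 / expected
    return stat

print(chi2_from_sums(x, y))  # 579.0
```

Here the feature values sum to 579, all in class 0, while each class is expected to hold 289.5, so the statistic is 289.5 + 289.5 = 579. The order of the values within a class does not affect this number.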

Edit: how I created the contingency table for the scipy case


Reference formula (in terms of A, B, C, D and N):

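So the number of non-zero values I get in class 0 (the negative class) is 19. The total number of samples is 146, and the number of non-zero values in the positive class is 0. Based on this, the values I get for A, B, C, D and N are 0, 19, 73, 54 and 146, which I passed to the scipy.stats function.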
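The counting described above can be sketched as follows, assuming rows mean "feature non-zero / feature zero" and columns mean "class 1 / class 0", which is the layout that reproduces [[0, 19], [73, 54]]. The helper names presence_table and pearson_chi2 are hypothetical, and x is rebuilt compactly from the 19 non-zero values of X rather than copied element by element.

```python
# The 19 non-zero feature values (all in class 0), padded with zeros to 146 samples.
nonzero_class0 = [72, 24, 9, 3, 54, 3, 15, 111, 27, 6, 3, 17, 3, 8, 52, 18, 52, 52, 50]
x = nonzero_class0 + [0] * (146 - len(nonzero_class0))
y = [0] * 73 + [1] * 73

def presence_table(x, y):
    """2x2 table: rows = feature non-zero / zero, columns = class 1 / class 0."""
    a = sum(1 for xi, yi in zip(x, y) if xi != 0 and yi == 1)
    b = sum(1 for xi, yi in zip(x, y) if xi != 0 and yi == 0)
    c = sum(1 for xi, yi in zip(x, y) if xi == 0 and yi == 1)
    d = sum(1 for xi, yi in zip(x, y) if xi == 0 and yi == 0)
    return [[a, b], [c, d]]

def pearson_chi2(table):
    """Pearson chi-square from expected counts row_total * col_total / n."""
    n = float(sum(sum(row) for row in table))
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat

table = presence_table(x, y)
print(table)                          # [[0, 19], [73, 54]]
print(round(pearson_chi2(table), 2))  # 21.84
```

This table only records whether the feature is zero or not, so the magnitudes of the values in X play no role in the 21.84.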

