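<p>The problem is caused by the subset <code>total_features[400:]</code>. If you inspect that data, you will see that the column <code>total_features[400:, 1]</code> and one other column contain only zeros. This breaks your code because the mean and the standard deviation of those columns are both 0, so the calculation becomes 0/0.</p>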
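<p>The failure can be reproduced on a small array. The sketch below assumes that <code>regularise</code> subtracts the column mean and divides by the column standard deviation; the function body is not shown here, so <code>regularise_sketch</code> and the sample data are hypothetical stand-ins:</p>

```python
import numpy as np


def regularise_sketch(x):
    """Hypothetical stand-in for regularise: column-wise z-score."""
    return (x - x.mean(axis=0)) / x.std(axis=0)


# Made-up data: the second column is all zeros, like the problem columns.
data = np.array([[1.0, 0.0],
                 [2.0, 0.0],
                 [3.0, 0.0]])

# Silence the RuntimeWarning that NumPy emits for 0/0.
with np.errstate(divide="ignore", invalid="ignore"):
    result = regularise_sketch(data)

# The all-zero column has mean 0 and std 0, so every entry is 0/0 = nan.
print(np.isnan(result[:, 1]).all())
# The non-constant column is unaffected.
print(np.isfinite(result[:, 0]).all())
```

<p>Both checks print <code>True</code>: only the constant column turns into <code>nan</code>.</p>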
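<p>Instead of writing your own regularization function, you can use <a href="http://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.scale.html" rel="nofollow noreferrer"><code>sklearn.preprocessing.scale</code></a>. It handles constant columns by returning a column of zeros.</p>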
<p>You can easily verify that <code>scale</code> performs the same calculation as your <code>regularise</code>:</p>
<pre><code>In [68]: test
Out[68]:
array([[ 15.,   1.,   0.],
       [  3.,   4.,   5.],
       [  6.,   7.,   8.],
       [  9.,  10.,  11.],
       [ 12.,  13.,   1.]])

In [69]: regularise(test)
Out[69]:
array([[ 1.41421356, -1.41421356, -1.20560706],
       [-1.41421356, -0.70710678,  0.        ],
       [-0.70710678,  0.        ,  0.72336423],
       [ 0.        ,  0.70710678,  1.44672847],
       [ 0.70710678,  1.41421356, -0.96448564]])

In [70]: from sklearn.preprocessing import scale

In [71]: scale(test)
Out[71]:
array([[ 1.41421356, -1.41421356, -1.20560706],
       [-1.41421356, -0.70710678,  0.        ],
       [-0.70710678,  0.        ,  0.72336423],
       [ 0.        ,  0.70710678,  1.44672847],
       [ 0.70710678,  1.41421356, -0.96448564]])
</code></pre>
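<p>The same comparison can be made programmatically rather than by eye. This is a minimal sketch: <code>regularise_sketch</code> is a hypothetical stand-in that assumes <code>regularise</code> is a column-wise z-score using the population standard deviation, which is also what <code>scale</code> uses by default:</p>

```python
import numpy as np
from sklearn.preprocessing import scale


def regularise_sketch(x):
    """Hypothetical stand-in for regularise: column-wise z-score."""
    return (x - x.mean(axis=0)) / x.std(axis=0)


# Same test array as in the session above.
test = np.array([[15.0, 1.0, 0.0],
                 [3.0, 4.0, 5.0],
                 [6.0, 7.0, 8.0],
                 [9.0, 10.0, 11.0],
                 [12.0, 13.0, 1.0]])

# allclose compares element-wise within a floating-point tolerance.
same = np.allclose(regularise_sketch(test), scale(test))
print(same)
```

<p>Using <code>np.allclose</code> instead of <code>==</code> avoids false mismatches caused by floating-point rounding.</p>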
<p>When a column consists entirely of zeros, <code>scale</code> returns zeros for that column instead of <code>nan</code>.</p>
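<p>As an illustration of that behaviour, here is a short sketch on made-up data (not the original <code>total_features</code> array) in which the second column is all zeros:</p>

```python
import numpy as np
from sklearn.preprocessing import scale

# Made-up data: the second column is constant (all zeros).
data = np.array([[1.0, 0.0],
                 [2.0, 0.0],
                 [3.0, 0.0]])

scaled = scale(data)

# The constant column comes back as zeros rather than nan.
print(scaled[:, 1])
# The non-constant column is standardised to zero mean and unit variance.
print(np.isclose(scaled[:, 0].mean(), 0.0), np.isclose(scaled[:, 0].std(), 1.0))
```

<p>Because the output contains no <code>nan</code> values, it can be passed on to later steps without special handling for constant columns.</p>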