<pre><code>In [388]: from scipy import sparse
</code></pre>
<p>Make a sample matrix:</p>
<pre><code>In [390]: M = sparse.random(10,8,.2, 'csc')
</code></pre>
<p>The matrix sum along axis 0 (one total per column):</p>
<pre><code>In [393]: M.sum(axis=0)
Out[393]:
matrix([[1.95018736, 0.90924629, 1.93427113, 2.38816133, 1.08713479,
0. , 2.45435481, 0. ]])
</code></pre>
<p>Dividing by that result triggers a warning and produces <code>nan</code> wherever a column sum is 0:</p>
<pre><code>In [394]: M/_
/usr/local/lib/python3.6/dist-packages/scipy/sparse/base.py:599: RuntimeWarning: invalid value encountered in true_divide
return np.true_divide(self.todense(), other)
Out[394]:
matrix([[0. , 0. , 0. , 0. , 0.27079623,
nan, 0.13752665, nan],
[0. , 0. , 0. , 0. , 0. ,
nan, 0.32825122, nan],
[0. , 0. , 0. , 0. , 0. ,
nan, 0. , nan],
...
nan, 0. , nan]])
</code></pre>
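<p>If a dense result is acceptable, one way to avoid both the warning and the <code>nan</code> values is to divide only where the column sum is nonzero. This is a minimal sketch, not what I ran above: the names <code>col_sums</code>, <code>dense</code> and <code>safe</code> are made up for illustration, and the fixed <code>random_state</code> is only there to make it reproducible.</p>

```python
import numpy as np
from scipy import sparse

# Sample matrix, same shape and density as above (seed is an assumption).
M = sparse.random(10, 8, density=0.2, format='csc', random_state=0)

# Column sums as a plain (1, 8) ndarray instead of np.matrix.
col_sums = np.asarray(M.sum(axis=0))

dense = M.toarray()

# Divide only where the column sum is nonzero; elsewhere keep the 0 from `out`.
safe = np.divide(dense, col_sums, out=np.zeros_like(dense), where=col_sums != 0)
```

<p>Columns whose sum is 0 stay all 0 instead of turning into <code>nan</code>, but the result is no longer sparse.</p>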
<p>The 0s also cause problems for your method:</p>
<pre><code>In [395]: for i in range(8):
     ...:     xs = sum(M[:,i])
     ...:     M[:,i] = M[:,i]/xs.data[0]
     ...:
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-395-0195298ead19> in <module>
      1 for i in range(8):
      2     xs = sum(M[:,i])
----> 3     M[:,i] = M[:,i]/xs.data[0]
      4

IndexError: index 0 is out of bounds for axis 0 with size 0
</code></pre>
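<p>The loop fails at the first column with no stored values, because there is no <code>data[0]</code> to read. A guarded per-column loop might look like the sketch below. It is not your method as written: it assumes the matrix is in CSC format and works directly on the <code>data</code> and <code>indptr</code> arrays, and <code>normalize_columns_inplace</code> is a hypothetical helper name.</p>

```python
import numpy as np
from scipy import sparse

def normalize_columns_inplace(M):
    """Scale each column of a float CSC matrix to sum to 1, skipping zero-sum columns."""
    if M.format != 'csc':
        raise ValueError("expected a CSC matrix")
    for i in range(M.shape[1]):
        # Stored values of column i live in data[indptr[i]:indptr[i+1]].
        start, stop = M.indptr[i], M.indptr[i + 1]
        total = M.data[start:stop].sum()
        if total != 0:            # empty (or zero-sum) columns are left alone
            M.data[start:stop] /= total
    return M

M = sparse.random(10, 8, density=0.2, format='csc', random_state=0)
normalize_columns_inplace(M)
```

<p>Because only the stored values are touched, the sparsity structure does not change.</p>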
<p>But if we compare the columns that don't have a 0 sum, the values match:</p>
<pre><code>In [401]: Out[394][:,:5]
Out[401]:
matrix([[0. , 0. , 0. , 0. , 0.27079623],
[0. , 0. , 0. , 0. , 0. ],
[0. , 0. , 0. , 0. , 0. ],
[0. , 0. , 0. , 0. , 0. ],
[0.49648886, 0.25626608, 0. , 0.19162678, 0.72920377],
[0. , 0. , 0.30200765, 0. , 0. ],
[0.50351114, 0. , 0.30445113, 0.41129367, 0. ],
[0. , 0.74373392, 0. , 0. , 0. ],
[0. , 0. , 0.39354122, 0. , 0. ],
[0. , 0. , 0. , 0.39707955, 0. ]])
In [402]: M.A[:,:5]
Out[402]:
array([[0. , 0. , 0. , 0. , 0.27079623],
[0. , 0. , 0. , 0. , 0. ],
[0. , 0. , 0. , 0. , 0. ],
[0. , 0. , 0. , 0. , 0. ],
[0.49648886, 0.25626608, 0. , 0.19162678, 0.72920377],
[0. , 0. , 0.30200765, 0. , 0. ],
[0.50351114, 0. , 0.30445113, 0.41129367, 0. ],
[0. , 0.74373392, 0. , 0. , 0. ],
[0. , 0. , 0.39354122, 0. , 0. ],
[0. , 0. , 0. , 0.39707955, 0. ]])
</code></pre>
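<p>In [394] I should have converted the matrix sum to sparse first, so that the result would be sparse too. Sparse matrices have no element-wise division, so I first had to take the element-wise reciprocal of the dense sum. The 0s are still a nuisance, since the reciprocal of a 0 sum is <code>inf</code>.</p>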
<pre><code>In [409]: M.multiply(sparse.csr_matrix(1/Out[393]))
...
Out[409]:
<10x8 sparse matrix of type '<class 'numpy.float64'>'
with 16 stored elements in Compressed Sparse Column format>
</code></pre>
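<p>A loop-free variant that stays sparse and sidesteps the 0 problem could look like the sketch below. It is not what I ran above: <code>normalize_columns</code> is a hypothetical helper name, and it swaps in right-multiplication by a diagonal matrix of reciprocals (<code>sparse.diags</code>), using a scale of 1 for columns whose sum is 0 so that they are left untouched.</p>

```python
import numpy as np
from scipy import sparse

def normalize_columns(M):
    """Return a CSC copy of M whose nonzero-sum columns each sum to 1."""
    M = sparse.csc_matrix(M, dtype=float)
    col_sums = np.asarray(M.sum(axis=0)).ravel()

    # Reciprocal only where the sum is nonzero; 1 elsewhere (column stays all 0).
    scale = np.ones_like(col_sums)
    nonzero = col_sums != 0
    scale[nonzero] = 1.0 / col_sums[nonzero]

    # Right-multiplying by a diagonal matrix scales column j by scale[j].
    return (M @ sparse.diags(scale)).tocsc()

M = sparse.random(10, 8, density=0.2, format='csc', random_state=0)
N = normalize_columns(M)
```

<p>No <code>inf</code> or <code>nan</code> is ever created, so no warning is raised, and <code>N</code> has the same number of stored elements as <code>M</code>.</p>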