<p>While row selection is faster than column selection for <code>csr</code>, the difference is not large:</p>
<pre><code>In [288]: Mbig=sparse.rand(1000,1000,.1, 'csr')
In [289]: Mbig[:1000:50,:]
Out[289]:
&lt;20x1000 sparse matrix of type '&lt;class 'numpy.float64'&gt;'
with 2066 stored elements in Compressed Sparse Row format&gt;
In [290]: timeit Mbig[:1000:50,:]
1000 loops, best of 3: 1.53 ms per loop
In [291]: timeit Mbig[:,:1000:50]
100 loops, best of 3: 2.04 ms per loop
In [292]: Mbig=sparse.rand(1000,1000,.1, 'csc')
In [293]: timeit Mbig[:1000:50,:]
100 loops, best of 3: 2.16 ms per loop
In [294]: timeit Mbig[:,:1000:50]
1000 loops, best of 3: 1.65 ms per loop
</code></pre>
<p>Converting the format first is not worth it.</p>
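<p>A minimal sketch of how one might check this on a given machine, assuming the cost in question is a <code>tocsc()</code> call made before every column slice. The variable names and the <code>timeit</code> setup are illustrative and not taken from the session above; timings vary with hardware and SciPy version.</p>

```python
import timeit
from scipy import sparse

# Illustrative matrix, same shape and density as the one timed above
M_csr = sparse.random(1000, 1000, density=0.1, format='csr', random_state=0)

n = 100
# Column slice taken directly from the csr matrix
t_direct = timeit.timeit(lambda: M_csr[:, :1000:50], number=n)
# Same slice, but paying for a csr -> csc conversion on every call
t_convert = timeit.timeit(lambda: M_csr.tocsc()[:, :1000:50], number=n)

print(f"direct csr column slice: {t_direct / n * 1e3:.3f} ms per call")
print(f"tocsc() then slice:      {t_convert / n * 1e3:.3f} ms per call")

# Both routes return the same values
a = M_csr[:, :1000:50]
b = M_csr.tocsc()[:, :1000:50]
```

<p>If many slices are taken from the same matrix, converting once and reusing the result is a different trade-off from the one measured here.</p>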
<p>Contrast that with the same slice on the dense version:</p>
<pre><code>In [297]: A=Mbig.A
In [298]: timeit A[:,:1000:50]
...
1000000 loops, best of 3: 557 ns per loop
In [301]: timeit A[:,:1000:50].copy()
...
10000 loops, best of 3: 52.5 µs per loop
</code></pre>
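<p>To complicate the comparison, indexing the sparse matrix with an array (<code>numpy</code> advanced indexing) is actually faster than indexing it with a "slice":</p>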
<pre><code>In [308]: idx=np.r_[0:1000:50] # expand slice into array
In [309]: timeit Mbig[idx,:]
1000 loops, best of 3: 1.49 ms per loop
In [310]: timeit Mbig[:,idx]
1000 loops, best of 3: 513 µs per loop
</code></pre>
<p>Here the column indexing of the <code>csc</code> matrix shows the greater speed improvement.</p>
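<p>As a quick sanity check, the sketch below (with an illustrative matrix, not the one timed above) confirms that the array built by <code>np.r_</code> selects the same columns as the slice, so the two timings compare like with like:</p>

```python
import numpy as np
from scipy import sparse

# Illustrative csc matrix with the same shape and density as in the timings
M = sparse.random(1000, 1000, density=0.1, format='csc', random_state=0)

idx = np.r_[0:1000:50]        # expands the slice into an index array
by_slice = M[:, :1000:50]     # basic slicing
by_array = M[:, idx]          # advanced (array) indexing

# The two selections should hold identical values
same = (by_slice.toarray() == by_array.toarray()).all()
print(same)
```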
<p>And for a single row or column, <code>csr</code> and <code>csc</code> have <code>getrow</code> and <code>getcol</code> methods:</p>
<pre><code>In [314]: timeit Mbig.getrow(500)
1000 loops, best of 3: 434 µs per loop
In [315]: timeit Mbig.getcol(500) # 1 column from csc is fastest
10000 loops, best of 3: 78.7 µs per loop
In [316]: timeit Mbig[500,:]
1000 loops, best of 3: 505 µs per loop
In [317]: timeit Mbig[:,500]
1000 loops, best of 3: 264 µs per loop
</code></pre>
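<p>In <a href="https://stackoverflow.com/a/39500986/901925">https://stackoverflow.com/a/39500986/901925</a> I recreated the <code>extractor</code> code that <code>sparse</code> uses to fetch rows or columns. It constructs a new sparse "vector" of 1s and 0s, and uses matrix multiplication to "select" the rows or columns.</p>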
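<p>The idea can be sketched as follows. This is a simplified illustration of selection by matrix multiplication, not the code from the linked answer or from SciPy itself; the helper names <code>extract_rows</code> and <code>extract_cols</code> are hypothetical.</p>

```python
import numpy as np
from scipy import sparse

def extract_rows(M, rows):
    """Select rows of M by left-multiplying with a sparse 0/1 selector."""
    rows = np.asarray(rows)
    n_sel = len(rows)
    ones = np.ones(n_sel, dtype=M.dtype)
    # selector[i, rows[i]] = 1, so (selector @ M)[i, :] equals M[rows[i], :]
    selector = sparse.csr_matrix(
        (ones, (np.arange(n_sel), rows)), shape=(n_sel, M.shape[0]))
    return selector @ M

def extract_cols(M, cols):
    """Select columns of M by right-multiplying with a sparse 0/1 selector."""
    cols = np.asarray(cols)
    n_sel = len(cols)
    ones = np.ones(n_sel, dtype=M.dtype)
    # selector[cols[j], j] = 1, so (M @ selector)[:, j] equals M[:, cols[j]]
    selector = sparse.csc_matrix(
        (ones, (cols, np.arange(n_sel))), shape=(M.shape[1], n_sel))
    return M @ selector

M = sparse.random(1000, 1000, density=0.1, format='csr', random_state=0)
idx = np.arange(0, 1000, 50)
picked_rows = extract_rows(M, idx)
picked_cols = extract_cols(M, idx)
```

<p>Each selector has exactly one nonzero per selected row or column, so the product copies the chosen entries unchanged, and the cost is that of one sparse matrix product.</p>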