Scipy sparse matrix loop takes forever, need to make it more efficient

Posted 2024-09-30 06:20:43


What is the most time- and memory-efficient way to write this loop with a sparse matrix (currently using csc_matrix)?

for j in range(0, reducedsize):
    xs = sum(X[:, j])
    X[:, j] = X[:, j] / xs.data[0]

For example:

reducedsize (int) - 2500
X (csc_matrix) - 908x2500

The loop does iterate, but it takes a very long time compared with just using numpy.


2 answers

If you want to do this without any memory overhead (in place)

Always think about how the data is actually stored. A small example of a csc matrix:

import numpy as np
from scipy import sparse

shape=(5,5)
X=sparse.random(shape[0], shape[1], density=0.5, format='csc')
print(X.todense())

[[0.12146814 0.         0.         0.04075121 0.28749552]
 [0.         0.92208639 0.         0.44279661 0.        ]
 [0.63509196 0.42334964 0.         0.         0.99160443]
 [0.         0.         0.25941113 0.44669367 0.00389409]
 [0.         0.         0.         0.         0.83226886]]

i=0 #first column
print(X.data[X.indptr[i]:X.indptr[i+1]])
[0.12146814 0.63509196]
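
To make the storage layout concrete, here is a minimal sketch on a small hand-written matrix (the 3x3 values are hypothetical, chosen only so the arrays are easy to read). It shows how data, indices and indptr fit together, including an empty column.

```python
import numpy as np
from scipy import sparse

# Hypothetical 3x3 example; the middle column is deliberately empty.
dense = np.array([[1.0, 0.0, 2.0],
                  [0.0, 0.0, 3.0],
                  [4.0, 0.0, 0.0]])
X = sparse.csc_matrix(dense)

print(X.data)     # non-zero values, stored column by column
print(X.indices)  # row index of each stored value
print(X.indptr)   # column j occupies data[indptr[j]:indptr[j+1]]

# Walk the columns the same way the slicing above does.
for j in range(X.shape[1]):
    start, stop = X.indptr[j], X.indptr[j + 1]
    print(j, X.indices[start:stop], X.data[start:stop])
```

An empty column shows up as two equal consecutive entries in indptr, so its slice of data has length zero.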

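A simple solution

So the only thing we have to do here is modify the non-zero entries column by column, in place. This can easily be done with a partly vectorized numpy solution. data is the array containing all non-zero values, and indptr stores where each column starts and ends.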

def Numpy_csc_norm(data,indptr):
    for i in range(indptr.shape[0]-1):
        xs = np.sum(data[indptr[i]:indptr[i+1]])
        #Modify the view in place
        data[indptr[i]:indptr[i+1]]/=xs
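
As a usage sketch, the function can be called on the data and indptr arrays of a matrix with the shape from the question. The density and random_state values below are assumptions picked for illustration; the function body repeats the one above so the snippet runs on its own.

```python
import numpy as np
from scipy import sparse

def Numpy_csc_norm(data, indptr):
    # Same per-column loop as above: each slice is a view, so the
    # division writes straight into the matrix's own data array.
    for i in range(indptr.shape[0] - 1):
        xs = np.sum(data[indptr[i]:indptr[i + 1]])
        data[indptr[i]:indptr[i + 1]] /= xs

# Shape taken from the question; density and seed are assumed.
X = sparse.random(908, 2500, density=0.01, format='csc', random_state=0)
Numpy_csc_norm(X.data, X.indptr)

# Every column that stores at least one value should now sum to 1.
col_sums = np.asarray(X.sum(axis=0)).ravel()
nonempty = np.diff(X.indptr) > 0
print(col_sums[nonempty].min(), col_sums[nonempty].max())
```

Columns without stored values are left untouched, because their slice of data is empty.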

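Performance-wise, this in-place solution is already not too bad. If you want to improve performance further, you could use Cython, Numba, or some other compiled code that can be wrapped more or less easily in Python.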
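
Before reaching for compiled code, the Python-level loop can also be removed entirely in NumPy. The sketch below is not from the answers; it is an alternative that expands the column sums to one divisor per stored value with np.repeat. The helper name repeat_csc_norm is hypothetical, and the sketch assumes that no column has stored values summing to exactly zero.

```python
import numpy as np
from scipy import sparse

def repeat_csc_norm(X):
    # Hypothetical helper: normalise the columns of a CSC matrix in place.
    counts = np.diff(X.indptr)                  # stored values per column
    sums = np.asarray(X.sum(axis=0)).ravel()    # one sum per column
    # Repeat each column sum once per stored value in that column, so the
    # divisor lines up element by element with X.data. Empty columns are
    # repeated zero times and therefore never divide by zero.
    X.data /= np.repeat(sums, counts)

# Small random matrix; shape, density and seed are assumed for illustration.
X = sparse.random(50, 20, density=0.1, format='csc', random_state=1)
dense_before = X.toarray()
repeat_csc_norm(X)
print(np.asarray(X.sum(axis=0)).ravel())
```

This allocates one temporary array of length nnz, so it is not strictly free of memory overhead the way the loop versions are.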

Numba solution

import numba as nb

@nb.njit(fastmath=True,error_model="numpy",parallel=True)
def Numba_csc_norm(data,indptr):
    for i in nb.prange(indptr.shape[0]-1):
        #Sum the non-zero entries of column i
        acc=0.0
        for j in range(indptr[i],indptr[i+1]):
            acc+=data[j]
        #Divide the column by its sum in place
        for j in range(indptr[i],indptr[i+1]):
            data[j]/=acc

Performance

#Create a not too small example matrix
shape=(50_000,10_000)
X=sparse.random(shape[0], shape[1], density=0.001, format='csc')

#Not in-place from hpaulj
def hpaulj(X):
    acc=X.sum(axis=0)
    return X.multiply(sparse.csr_matrix(1./acc))

%timeit X2=hpaulj(X)
#6.54 ms ± 67.8 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

#Both variants below are in-place,
#but this shouldn't have an influence on the timings

%timeit Numpy_csc_norm(X.data,X.indptr)
#79.2 ms ± 914 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

#parallel=False -> faster on tiny matrices
%timeit Numba_csc_norm(X.data,X.indptr)
#626 µs ± 30.6 µs per loop (mean ± std. dev. of 7 runs, 1 loop each)

#parallel=True -> faster on larger matrices
%timeit Numba_csc_norm(X.data,X.indptr)
#185 µs ± 5.39 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)

In [388]: from scipy import sparse

Make a sample matrix:

In [390]: M = sparse.random(10,8,.2, 'csc')                                             

The matrix sum over each column:

In [393]: M.sum(axis=0)                                                                 
Out[393]: 
matrix([[1.95018736, 0.90924629, 1.93427113, 2.38816133, 1.08713479,
         0.        , 2.45435481, 0.        ]])

Those 0s produce a warning when dividing, and nan in the result:

In [394]: M/_                                                                           
/usr/local/lib/python3.6/dist-packages/scipy/sparse/base.py:599: RuntimeWarning: invalid value encountered in true_divide
  return np.true_divide(self.todense(), other)
Out[394]: 
matrix([[0.        , 0.        , 0.        , 0.        , 0.27079623,
                nan, 0.13752665,        nan],
        [0.        , 0.        , 0.        , 0.        , 0.        ,
                nan, 0.32825122,        nan],
        [0.        , 0.        , 0.        , 0.        , 0.        ,
                nan, 0.        ,        nan],
 ...
                nan, 0.        ,        nan]])

The 0s also cause a problem for your method:

In [395]: for i in range(8): 
     ...:     xs = sum(M[:,i]) 
     ...:     M[:,i] = M[:,i]/xs.data[0] 
     ...:                                                                               
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-395-0195298ead19> in <module>
      1 for i in range(8):
      2     xs = sum(M[:,i])
----> 3     M[:,i] = M[:,i]/xs.data[0]
      4 

IndexError: index 0 is out of bounds for axis 0 with size 0

But if we compare the columns whose sum is not 0, the values match:

In [401]: Out[394][:,:5]                                                                
Out[401]: 
matrix([[0.        , 0.        , 0.        , 0.        , 0.27079623],
        [0.        , 0.        , 0.        , 0.        , 0.        ],
        [0.        , 0.        , 0.        , 0.        , 0.        ],
        [0.        , 0.        , 0.        , 0.        , 0.        ],
        [0.49648886, 0.25626608, 0.        , 0.19162678, 0.72920377],
        [0.        , 0.        , 0.30200765, 0.        , 0.        ],
        [0.50351114, 0.        , 0.30445113, 0.41129367, 0.        ],
        [0.        , 0.74373392, 0.        , 0.        , 0.        ],
        [0.        , 0.        , 0.39354122, 0.        , 0.        ],
        [0.        , 0.        , 0.        , 0.39707955, 0.        ]])
In [402]: M.A[:,:5]                                                                     
Out[402]: 
array([[0.        , 0.        , 0.        , 0.        , 0.27079623],
       [0.        , 0.        , 0.        , 0.        , 0.        ],
       [0.        , 0.        , 0.        , 0.        , 0.        ],
       [0.        , 0.        , 0.        , 0.        , 0.        ],
       [0.49648886, 0.25626608, 0.        , 0.19162678, 0.72920377],
       [0.        , 0.        , 0.30200765, 0.        , 0.        ],
       [0.50351114, 0.        , 0.30445113, 0.41129367, 0.        ],
       [0.        , 0.74373392, 0.        , 0.        , 0.        ],
       [0.        , 0.        , 0.39354122, 0.        , 0.        ],
       [0.        , 0.        , 0.        , 0.39707955, 0.        ]])

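In [394] I should have converted the matrix sum to sparse first, so that the result would also be sparse. Sparse matrices have no element-wise division, so I have to take the element-wise reciprocal of the dense sums first. The 0s are still a nuisance.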

In [409]: M.multiply(sparse.csr_matrix(1/Out[393]))                                     
...
Out[409]: 
<10x8 sparse matrix of type '<class 'numpy.float64'>'
    with 16 stored elements in Compressed Sparse Column format>
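
One way to sidestep the zero-sum columns is to compute the reciprocal only where the sum is non-zero and then scale the columns with a diagonal matrix. This is a sketch under assumptions rather than part of the answer above: the names inv and M_norm are hypothetical, the seed is arbitrary, and zero-sum columns are simply left as zeros.

```python
import numpy as np
from scipy import sparse

# Same shape and density as the sample matrix above; the seed is assumed.
M = sparse.random(10, 8, density=0.2, format='csc', random_state=0)

sums = np.asarray(M.sum(axis=0)).ravel()

# Reciprocal of the column sums; entries for zero-sum columns stay 0,
# so no division by zero happens and no nan is produced.
inv = np.zeros_like(sums)
np.divide(1.0, sums, out=inv, where=sums != 0)

# Right-multiplying by a diagonal matrix scales column j by inv[j]
# and keeps the result sparse.
M_norm = (M @ sparse.diags(inv)).tocsc()
print(np.asarray(M_norm.sum(axis=0)).ravel())
```

Unlike the in-place variants, this builds a new matrix, so it trades some memory for not having to touch data and indptr directly.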
