用稀疏矩阵优化大数阵乘法

2024-06-26 12:53:26 发布

您现在位置：Python中文网/ 问答频道 /正文

2663

网友

男 | 程序猿一只，喜欢编程写python代码。

我正在进行以下表格的统计计算：

在哪里

d（数据）是一个矩阵[i=30100x100]
m（模型）是一个大小为[i=30100x100]的矩阵
C^-1（协方差）是一个100²乘100²的对称矩阵

d和C^-1是常数，除了C^-1对称外，没有任何结构。m变化，但总是稀疏的。计算的输出只是一个浮点数。在

我需要在蒙特卡罗模拟中多次执行这种计算，所以速度是关键。使用稀疏数组乘法技术与m相比，使用简单的矩阵点乘可以大大加快速度。但是，下面的slow函数的每次迭代仍需要大约0.1秒才能运行。绝大多数时间（大于98%）花在矩阵乘法上，而不是模型生成函数（generate_model）。如果可能的话，我想把速度提高一个数量级。在

代码和输出粘贴在下面。在

做不工作的事情包括：

升级到“英特尔MKL线性代数”例程（提速率为百分之几，小得惊人）
使用numpy.linalg.multi_dot
利用C^-1是对称的这一事实（这在原则上也不起作用，请参见this mathoverflow question）

某种程度上起作用的事情包括：

预先计算C^-1d，可获得约40%的加速

如何加快代码的速度？依赖于cython、numba等包的解决方案以及“标准”scipy/numy解决方案都是最受欢迎的。提前谢谢！在

from __future__ import division
import numpy as np
import scipy.sparse
import sys 
import timeit

def generate_model(n, size, hw = 8): 
    #model for the data--squares at random locations
    output = np.zeros((n, size, size))
    for i in range(n):
        randx = np.random.randint(hw, size-hw)
        randy = np.random.randint(hw, size-hw)
        output[i,(randx-hw):(randx+hw), (randy-hw):(randy+hw)]=np.random.random((hw*2, hw*2))
    return output

def slow_function(datacube, invcovmatrix, size):
    model = generate_model(30, size) 
    output = 0 
    for i in range(model.shape[0]):
        data = datacube[i,:,:].flatten()
        mu = model[i,:,:].flatten()
        sparsemu = scipy.sparse.csr_matrix(mu)
        output += -0.5* (
                  np.float(-2.0*sparsemu.dot(invcovmatrix).dot(data)) +
                  np.float(sparsemu.dot(sparsemu.dot(invcovmatrix).T))
                  )   
    return output

def wrapper(func, *args, **kwargs):
    def wrapped():
        return func(*args, **kwargs)
    return wrapped

if __name__ == "__main__":
    size = 100 
    invcovmat    = np.random.random((size**2, size**2))
    #make symmetric for consistency
    invcovmat    = (invcovmat+invcovmat.T)/2
    datacube     = np.random.random((30, size, size))
    #TIMING
    wrapped = wrapper(slow_function, datacube, invcovmat, size)
    times = []
    for i in range(20):
        print i
        times.append(timeit.timeit(wrapped, number = 1)) 
    times.sort()
    print '\n', np.mean(times[0:3]), ' s/iteration; best of 3'

输出：

^{pr2}$

Tags： import for output size model return def np

0条回答

目前没有回答

用稀疏矩阵优化大数阵乘法

相关问题更多 >

编程相关推荐

热门问题

热门文章

用稀疏矩阵优化大数阵乘法

相关问题 更多 >

编程相关推荐

热门问题

热门文章

相关问题更多 >