Multiprocessing a for loop (numpy.ndarray)

Published 2024-09-27 00:20:15

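I am trying to use the multiprocessing module in Python 2.7 to speed up a for loop over a numpy array. To compute a PMI matrix, I use an already-created matrix 'C' with 6018 rows and 27721 columns. However, when I run the code below I get "[Errno 12] Cannot allocate memory". I assume the error is related to the variable PMI: if I move it inside the pmiCreation function (making PMI a local variable, although I naturally want it to be global), the error disappears. That does not help, though, because I need the program to remember the updates made to PMI. Is there any way to solve this?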

Creating the mutual information matrix:

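# Imports assumed here; the original post omitted them.
import numpy as np
from math import log10
from multiprocessing import Pool

print "Creating mutual information matrix"
PMI = np.zeros((C.shape))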

def pmiCreation(indexStart):                         
    N = C.sum()
    invN = 1.0/N  # float literal avoids integer division in Python 2.7; the formula below multiplies by invN instead of dividing by N
    row, col = C.shape      
    print "Creating mutual information matrix using indexStart:",indexStart         

    for r in range(row)[indexStart:indexStart+346]:  # u
        for c in range(r):  # w
            if C[r,c]!=0:  # if they co-occur
                num = C[r,c]*invN  # getting number of reviews where u and w co-occur and multiply by invN (numerator)
                denom = (sum(C[:,c])*invN) * (sum(C[r])*invN)
                pmi = log10(num*(1/denom))           
                PMI[r,c] = pmi
                PMI[c,r] = pmi

pool = Pool(processes=8)       # process per core    
index_inputs = [0,346,692,1038,1384, 1730,2166, 2512,2858]    
pool.map(pmiCreation, index_inputs)  
