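I am trying to use Python 2.7's multiprocessing module to speed up a loop over a numpy array. To compute a PMI (pointwise mutual information) matrix, I use an already-built matrix 'C' with 6018 rows and 27721 columns. However, when I run the code below I get "[Errno 12] Cannot allocate memory". I assume the error is related to the variable PMI, because if I move it inside the pmiCreation function (making PMI local, although I naturally want it to be global), the error goes away. That does not help, though, because I need the program to remember the updates made to PMI. Is there a way to solve this?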
import numpy as np
from math import log10
from multiprocessing import Pool

# C is the existing co-occurrence matrix (6018 rows x 27721 columns)
print "Creating mutual information matrix"
PMI = np.zeros(C.shape)

def pmiCreation(indexStart):
    N = C.sum()
    invN = 1.0 / N  # multiply by invN instead of dividing by N; 1.0 avoids integer division in Python 2
    row, col = C.shape
    print "Creating mutual information matrix using indexStart:", indexStart
    for r in range(row)[indexStart:indexStart + 346]:  # u
        for c in range(r):  # w
            if C[r, c] != 0:  # if they co-occur
                num = C[r, c] * invN  # fraction of reviews where u and w co-occur (numerator)
                denom = (sum(C[:, c]) * invN) * (sum(C[r]) * invN)
                pmi = log10(num * (1 / denom))
                PMI[r, c] = pmi
                PMI[c, r] = pmi

pool = Pool(processes=8)  # process per core
index_inputs = [0, 346, 692, 1038, 1384, 1730, 2166, 2512, 2858]
pool.map(pmiCreation, index_inputs)
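
A note on why the updates get lost, with a minimal sketch of one possible workaround. With multiprocessing, each worker process works on its own copy of module-level variables, so assignments to PMI made inside a worker are not visible in the parent process. A common pattern is to have each worker return what it computed and let the parent write the values into the matrix. The sketch below assumes a small hypothetical matrix stored as plain lists so that it runs without dependencies; the names pmi_rows and build_pmi and the chunk bounds are illustrative and not part of the original code. It does not by itself address the memory error.

```python
from math import log10
from multiprocessing import Pool

# Hypothetical toy co-occurrence counts standing in for the real matrix C.
C = [
    [0, 2, 1, 0],
    [2, 0, 3, 1],
    [1, 3, 0, 4],
    [0, 1, 4, 0],
]
N = float(sum(sum(row) for row in C))      # total count, as a float
ROW_SUMS = [sum(row) for row in C]         # computed once, not per cell
COL_SUMS = [sum(col) for col in zip(*C)]


def pmi_rows(bounds):
    """Return (r, c, pmi) triples for rows in [start, stop), lower triangle only."""
    start, stop = bounds
    entries = []
    for r in range(start, stop):
        for c in range(r):
            if C[r][c] != 0:  # only pairs that co-occur
                num = C[r][c] / N
                denom = (COL_SUMS[c] / N) * (ROW_SUMS[r] / N)
                entries.append((r, c, log10(num / denom)))
    return entries


def build_pmi(chunks, processes=2):
    """Map pmi_rows over the chunks and assemble the matrix in the parent."""
    n = len(C)
    pmi = [[0.0] * n for _ in range(n)]
    pool = Pool(processes=processes)
    try:
        results = pool.map(pmi_rows, chunks)  # workers return their entries
    finally:
        pool.close()
        pool.join()
    for entries in results:
        for r, c, value in entries:
            pmi[r][c] = value  # the parent performs the writes
            pmi[c][r] = value
    return pmi


if __name__ == "__main__":
    PMI = build_pmi([(0, 2), (2, 4)])
    print(PMI[1][0])
```

With the real data, the chunks would be (start, stop) row ranges over the rows of C, and only the non-zero entries would travel back to the parent. Precomputing the row and column sums once also avoids recomputing them for every cell.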