I have a loop that accumulates sums:
for t in reversed(range(len(inputs))):
    dy = np.copy(ps[t])
    dy[targets[t]] -= 1
    dWhy += np.dot(dy, hs[t].T)
    dby += dy
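The inputs are too large, so I need to parallelize it. I moved the loop body into a separate function and tried ThreadPoolExecutor, but the result is slower than the sequential version.
Here is my minimal working example: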
import numpy as np
import concurrent.futures
import time, random
from concurrent.futures import ThreadPoolExecutor
import threading

# parameters
dWhy = np.random.sample(300)
dby = np.random.sample(300)

def Func(ps, targets, hs, t):
    global dWhy, dby
    dy = np.copy(ps[t])
    dWhy += np.dot(dy, hs[t].T)
    dby += dy
    return dWhy, dby

if __name__ == '__main__':
    ps = np.random.sample(100000)
    targets = np.random.sample(100000)
    hs = np.random.sample(100000)

    start = time.time()
    for t in range(100000):
        dy = np.copy(ps[t])
        dWhy += np.dot(dy, hs[t].T)
        dby += dy
    finish = time.time()
    print("One thread: ")
    print(finish-start)

    dWhy = np.random.sample(300)
    dby = np.random.sample(300)
    start = time.time()
    with concurrent.futures.ThreadPoolExecutor() as executor:
        args = ((ps, targets, hs, t) for t in range(100000))
        for out1, out2 in executor.map(lambda p: Func(*p), args):
            dWhy, dby = out1, out2
    finish = time.time()
    print("Multithreads time: ")
    print(finish-start)
On my machine the single-threaded version takes about 3 s and the multithreaded version about 1 minute.
Consider using broadcasting instead of the loop. It runs about 20000 times faster.
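The answer's own code is not preserved here, so the following is only a sketch of what a vectorized version of the minimal example might look like. It assumes, as in that example, that ps and hs are 1-D arrays and that dWhy and dby are length-300 vectors; the function names accumulate_loop and accumulate_vectorized are made up for illustration. The speedup figure above is the answerer's and is not reproduced or verified by this sketch.

```python
import numpy as np

def accumulate_loop(ps, hs, dWhy, dby):
    """Reference version: the per-element loop from the minimal example."""
    dWhy = dWhy.copy()
    dby = dby.copy()
    for t in range(len(ps)):
        dy = ps[t]
        dWhy += dy * hs[t]   # scalar added to every element of dWhy
        dby += dy
    return dWhy, dby

def accumulate_vectorized(ps, hs, dWhy, dby):
    """Same sums with whole-array operations (assumes 1-D ps and hs)."""
    # sum over t of ps[t] * hs[t] is a dot product; the scalar result
    # is then broadcast across every element of dWhy.
    return dWhy + np.dot(ps, hs), dby + ps.sum()

rng = np.random.default_rng(0)
ps = rng.random(1000)
hs = rng.random(1000)
dWhy0 = rng.random(300)
dby0 = rng.random(300)

loop_dWhy, loop_dby = accumulate_loop(ps, hs, dWhy0, dby0)
vec_dWhy, vec_dby = accumulate_vectorized(ps, hs, dWhy0, dby0)
print(np.allclose(loop_dWhy, vec_dWhy), np.allclose(loop_dby, vec_dby))
```

The original loop also indexes with targets[t] and works on per-step arrays whose shapes are not given in the question, so stacking them for a single matrix product would need those shapes to be known first.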
Convert the lambda into a named function.
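A named worker could look roughly like the sketch below. It goes a step beyond the one-line suggestion: instead of submitting one task per index, it hands each thread a whole chunk and returns partial sums, which avoids both the per-task overhead and the shared global updates. The names partial_sums and accumulate_threaded and the n_chunks parameter are hypothetical, and the 1-D shapes are assumed from the minimal example.

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor

def partial_sums(chunk):
    """Hypothetical named worker: partial sums for one (ps, hs) chunk."""
    ps_chunk, hs_chunk = chunk
    return np.dot(ps_chunk, hs_chunk), ps_chunk.sum()

def accumulate_threaded(ps, hs, dWhy, dby, n_chunks=4):
    """Split the inputs into aligned chunks and combine per-chunk sums."""
    chunks = zip(np.array_split(ps, n_chunks), np.array_split(hs, n_chunks))
    with ThreadPoolExecutor() as executor:
        results = list(executor.map(partial_sums, chunks))
    total_why = sum(r[0] for r in results)
    total_by = sum(r[1] for r in results)
    # No globals are mutated; the caller receives new arrays.
    return dWhy + total_why, dby + total_by

rng = np.random.default_rng(1)
ps = rng.random(1000)
hs = rng.random(1000)
dWhy0 = rng.random(300)
dby0 = rng.random(300)

thr_dWhy, thr_dby = accumulate_threaded(ps, hs, dWhy0, dby0)
print(np.allclose(thr_dWhy, dWhy0 + np.dot(ps, hs)))
```

Whether threads help at all here depends on how much work each chunk does inside NumPy; for sums this small a single vectorized call is likely to be enough.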