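I'm learning to use multiprocessing in Python and I have a question. I want to count how many times each object (a tuple of words) occurs in a list. I tried two options: the first uses pool.starmap_async, the second does not use multiprocessing.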
ngrams=[('review', 'productivity'), ('productivity', 'satisfaction'), ('satisfaction', 'democratic'), ('democratic', 'autocratic'), ('autocratic', 'leadership'), ('leadership', 'empirical'), ('empirical', 'literature'), ('literature', 'explore'), ('explore', 'organizational_outcome'), ('organizational_outcome', 'democratic'), ('democratic', 'leadership'), ('leadership', 'task##oriented'), ('task##oriented', 'group'), ('group', 'individual'), ('individual', 'member'), ('member', 'productivity'), ('productivity', 'satisfaction'), ('satisfaction', 'receive'), ('receive', 'attention'), ('attention', 'emphasis')]
ngrams_uniq=[('satisfaction', 'democratic'), ('organizational_outcome', 'democratic'), ('review', 'productivity'), ('democratic', 'leadership'), ('member', 'productivity'), ('receive', 'attention'), ('empirical', 'literature'), ('group', 'individual'), ('literature', 'explore'), ('democratic', 'autocratic'), ('autocratic', 'leadership'), ('attention', 'emphasis'), ('task##oriented', 'group'), ('explore', 'organizational_outcome'), ('leadership', 'task##oriented'), ('satisfaction', 'receive'), ('productivity', 'satisfaction'), ('leadership', 'empirical'), ('individual', 'member')]
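import multiprocessing as mp
import time

def count_ngrams(gram, ngrams):
    # Return the n-gram together with its number of occurrences in the full list
    return (gram, ngrams.count(gram))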
## With a pool
print(time.strftime("%H:%M:%S"))
pool = mp.Pool(mp.cpu_count())
dict_freq_ngrams=pool.starmap_async(count_ngrams,[(gram,ngrams) for gram in ngrams_uniq]).get()
pool.close()
print(time.strftime("%H:%M:%S"))
## Without a pool
print(time.strftime("%H:%M:%S"))
dict_freq_ngrams=[count_ngrams(gram,ngrams) for gram in ngrams_uniq]
print(time.strftime("%H:%M:%S"))
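When I measure the execution time, the second option is always faster, and I don't understand why. Maybe I've made a mistake somewhere, but I can't see what it is.
Thanks in advance.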
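I don't think you've made a mistake. Rather, the overhead of multiprocessing (starting new interpreters and copying the data to them) outweighs the speed gain from computing in parallel, because on my Surface just starting the pool takes 0.2 to 0.3 seconds.
Here is the code I used to test: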
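The answer's test code is not included above. As a stand-in, here is a minimal sketch (not the answerer's original) of how the pool startup cost could be measured separately from the counting work. The helper names `time_serial` and `time_pool` and the small sample data are assumptions made for illustration. `time.perf_counter` is used instead of `time.strftime` because the latter only resolves whole seconds.

```python
import multiprocessing as mp
import time


def count_ngrams(gram, ngrams):
    # Same helper as in the question: count one tuple in the full list.
    return (gram, ngrams.count(gram))


def time_serial(ngrams_uniq, ngrams):
    # Hypothetical helper: run the plain list comprehension and time it.
    start = time.perf_counter()
    result = [count_ngrams(gram, ngrams) for gram in ngrams_uniq]
    return result, time.perf_counter() - start


def time_pool(ngrams_uniq, ngrams):
    # Hypothetical helper: time pool creation and the mapped work separately.
    # Note: depending on the start method, part of the worker startup may
    # still be included in the "work" figure.
    start = time.perf_counter()
    with mp.Pool(mp.cpu_count()) as pool:
        started = time.perf_counter()
        # Each task pickles (gram, ngrams), so the full list is sent repeatedly.
        args = [(gram, ngrams) for gram in ngrams_uniq]
        result = pool.starmap(count_ngrams, args)
        finished = time.perf_counter()
    return result, started - start, finished - started


if __name__ == "__main__":
    # Small stand-in data; substitute the ngrams lists from the question.
    ngrams = [("a", "b"), ("b", "c"), ("a", "b")]
    ngrams_uniq = list(dict.fromkeys(ngrams))  # unique items, order preserved

    serial_result, serial_time = time_serial(ngrams_uniq, ngrams)
    pool_result, startup_time, work_time = time_pool(ngrams_uniq, ngrams)

    print(f"serial:       {serial_time:.6f} s")
    print(f"pool startup: {startup_time:.6f} s")
    print(f"pool work:    {work_time:.6f} s")
```

With only a few dozen short tuples, the serial version would be expected to finish far faster than the pool can even start. Multiprocessing tends to pay off only when each task does enough work to amortize process startup and data transfer.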