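I am applying Pool.map() to a user-defined function that loads data from a DB, but it does not work: it keeps running forever and never finishes.
Without 'pool.map()', importing the data from the DB separately in individual Jupyter notebooks works without problems. Do you know what the difference is between loading from the DB individually and loading concurrently with map()?
If I knew the difference between the two approaches, I think I could find a clue to solving the endless looping.
When I use 'sum()' or another basic function instead of my user-defined function, parallel processing with 'pool.map()' runs fine.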
def parallelized():
    with Pool(processes = 8) as pool:
        if __name__ == '__main__':
            df = pool.map(minc, yms)
            pool.close()
            pool.join()
    return df
parallelized()
yms = ['201901','201902','201903','201904','201905','201906','201907','201908','201909','201910','201911','201912','201801','201802','201803','201804','201805','201806','201807','201808','201809','201810','201811','201812']
def minc(ym):
    print('MINC %s %s\n' %(ym, str(datetime.datetime.now())))
    print("value %s is in PID : %s \n" % (ym, os.getpid()))
    t = datetime.datetime.now()
    minc1 = pd.read_sql("""
        select substring(MINC_IN_YM,1,4) as YEAR,substring(MINC_IN_YM,5,2) as MONTH,
        MINC_VNDCD as 'FROM',
        MINC_BRNCD+''+MINC_BRNCD_WHS as 'TO',
        MINC_PTNO as PTNO,
        count(MINC_INSP_NO) as NROWS,
        sum(MINC_OKQTY) as TOTAL_QUANTITY,
        sum(MINC_AV_PRICE*MINC_OKQTY) as TOTAL_DOLLARS
        from dwadm.W_MINC
        where MINC_INC_INF in ('RN','CN')
        and MINC_ACCID in ('A', 'G', 'V')
        and MINC_IN_YM ='%s'
        and substring(MINC_BRNCD, 1, 1) not in ('S','C')
        GROUP BY YEAR,MONTH, MINC_BRNCD,MINC_BRNCD_WHS,MINC_VNDCD, MINC_PTNO
        """ % ym , conn)
    print(' MINC ends %s1 %s %s\n' %(ym,
        str(datetime.datetime.now()),str(datetime.datetime.now()-t)))
    return minc1
Put the if __name__ == '__main__': guard at the top level, around the call to parallelized(), rather than inside the function.
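The arrangement described in this answer might look roughly like the sketch below. It is only an illustration under stated assumptions: load_month() is a hypothetical stand-in for minc(), the SAMPLE_ROWS data and the in-memory sqlite3 table replace the real dwadm.W_MINC query, and opening a connection inside the worker is an assumption of this sketch, not something the question shows.

```python
import sqlite3
from multiprocessing import Pool

# Hypothetical sample rows (month, quantity) standing in for dwadm.W_MINC.
SAMPLE_ROWS = [("201901", 5), ("201901", 7), ("201902", 3)]


def load_month(ym):
    """Hypothetical stand-in for minc(): total quantity for one month."""
    # Assumption: each call opens its own connection instead of reusing
    # a connection object that was created in the parent process.
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE w_minc (in_ym TEXT, okqty INTEGER)")
        conn.executemany("INSERT INTO w_minc VALUES (?, ?)", SAMPLE_ROWS)
        row = conn.execute(
            "SELECT COALESCE(SUM(okqty), 0) FROM w_minc WHERE in_ym = ?",
            (ym,),
        ).fetchone()
    finally:
        conn.close()
    return ym, row[0]


def parallelized(yms, processes=2):
    # No __main__ check in here: the function only builds and uses the pool.
    # map() returns after all results are collected, so no explicit
    # close()/join() is needed before leaving the with-block.
    with Pool(processes=processes) as pool:
        return pool.map(load_month, yms)


if __name__ == "__main__":
    # Top-level guard: only the process that was started as the main
    # script creates the pool; workers that re-import this module do not.
    print(parallelized(["201901", "201902", "201903"]))
```

With the guard at module level, a worker process that imports the module again defines load_month() and parallelized() but does not start a pool of its own.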