Why does importing a local module cause concurrent.futures.ProcessPoolExecutor to raise a BrokenProcessPool exception?
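When I import a local module, running the file raises a BrokenProcessPool exception. I tried commenting out everything inside the module and got the same result. I also tried other local files/modules, again with the same result. However, if I comment out the import statement, or move it inside the main() function, the code runs without the process being terminated and without the exception. Why does this happen, and what can I do to avoid the exception?
I want to use concurrent.futures with ProcessPoolExecutor. My code sample is based on the top answer to this question: Parallelize apply after pandas groupby
Here is my version: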
import pandas as pd
import numpy as np
import time
from concurrent.futures import ProcessPoolExecutor, as_completed
import analysis_helper  # a local module

print(__name__)

nrows = 15000
np.random.seed(1980)
df = pd.DataFrame({'a': np.random.permutation(np.arange(nrows))})

def f1(group):
    time.sleep(0.0001)
    return group

def main():
    with ProcessPoolExecutor(12) as ppe:
        futures = []
        results = []
        for name, group in df.groupby('a'):
            p = ppe.submit(f1, group)
            futures.append(p)
        for future in as_completed(futures):
            r = future.result()
            results.append(r)
    df_output = pd.concat(results)
    print(df_output)

if __name__ == '__main__':
    main()
Result with the analysis_helper import removed:
runfile('C:/dev/.../test_parallelizer_pandas.py', wdir='C:/dev/...')
__main__
a
1255 1733
3372 11015
5318 4571
7076 14510
10545 10749
3340 483
11844 3736
3681 14509
2222 1041
3640 11014
4288 7852
12257 1040
2101 11034
14938 3065
8449 1842
7231 10746
7509 4353
4898 3797
2941 866
7497 14520
8302 11013
13882 9924
12007 1042
1567 10747
13135 7856
7742 485
13709 12571
1946 11012
5634 7848
7044 4354
...
3441 14213
179 14361
6723 12134
7528 5905
9273 12420
9916 3614
134 10166
11654 5854
11848 12133
14055 4278
6100 14360
726 14981
13139 14982
12552 14983
5393 14984
6927 14986
8108 14985
12665 14987
8587 14988
11437 14989
4191 14990
6877 14991
4997 14994
13527 14995
9477 14993
2930 14996
5456 14992
781 14997
3287 14998
13386 14999
[15000 rows x 1 columns]
Result with the analysis_helper import in place:
runfile('C:/dev/.../test_parallelizer_pandas.py', wdir='C:/dev/...')
__main__
Traceback (most recent call last):
  File "<ipython-input-7-7d6a88ec5a87>", line 1, in <module>
    runfile('C:/dev/.../test_parallelizer_pandas.py', wdir='C:/dev/...')
  File "C:\Users\david\Anaconda3\lib\site-packages\spyder\utils\site\sitecustomize.py", line 705, in runfile
    execfile(filename, namespace)
  File "C:\Users\david\Anaconda3\lib\site-packages\spyder\utils\site\sitecustomize.py", line 102, in execfile
    exec(compile(f.read(), filename, 'exec'), namespace)
  File "C:/dev/.../test_parallelizer_pandas.py", line 42, in <module>
    main()
  File "C:/dev/.../test_parallelizer_pandas.py", line 35, in main
    r = future.result()
  File "C:\Users\david\Anaconda3\lib\concurrent\futures\_base.py", line 425, in result
    return self.__get_result()
  File "C:\Users\david\Anaconda3\lib\concurrent\futures\_base.py", line 384, in __get_result
    raise self._exception
BrokenProcessPool: A process in the process pool was terminated abruptly while the future was running or pending.
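The traceback above only says that a worker died; it does not show what killed it. One way to look for the underlying error is to import the module in a fresh interpreter and read its stderr. The following is a minimal sketch, not a confirmed diagnosis: the helper name try_import_in_fresh_interpreter is hypothetical, and it assumes that the import itself is what brings the worker down, which the post does not establish.

```python
import subprocess
import sys


def try_import_in_fresh_interpreter(module_name, cwd=None):
    """Import `module_name` in a new Python process and report the outcome.

    Returns (returncode, stderr_text). A non-zero return code means the
    import failed or the interpreter exited abnormally; stderr_text holds
    whatever traceback the child process printed.
    """
    completed = subprocess.run(
        [sys.executable, "-c", f"import {module_name}"],
        cwd=cwd,              # directory the child starts in (None = inherit)
        capture_output=True,  # collect stdout/stderr instead of printing them
        text=True,            # decode the captured output to str
    )
    return completed.returncode, completed.stderr


if __name__ == "__main__":
    # "analysis_helper" is the local module named in the question.
    code, err = try_import_in_fresh_interpreter("analysis_helper")
    print("exit code:", code)
    print(err or "(no error output)")
```

If this reports a ModuleNotFoundError, the child interpreter may not be seeing the same working directory or sys.path as the console that launched the script. Passing the script's folder as cwd is one way to check that.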
Note: this only happens with ProcessPoolExecutor, not with ThreadPoolExecutor.
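A likely reason the two executors behave differently is that threads share the parent interpreter, while each pool process is a separate interpreter. With the "spawn" start method, which is the only one available on Windows, every worker starts fresh and re-imports the main script, so module-level statements such as the import run again in each worker. The sketch below only reports which start method is in effect; the function name describe_start_method is made up for illustration, and it does not by itself prove that the re-import is what fails here.

```python
import multiprocessing


def describe_start_method():
    """Return the name of the start method multiprocessing will use."""
    # allow_none=True avoids fixing the start method as a side effect.
    method = multiprocessing.get_start_method(allow_none=True)
    if method is None:
        # Nothing chosen yet: the first supported method is the default.
        method = multiprocessing.get_all_start_methods()[0]
    return method


if __name__ == "__main__":
    method = describe_start_method()
    print("start method:", method)
    if method == "spawn":
        # Under spawn, each worker re-imports the main script, so anything
        # outside the `if __name__ == "__main__":` guard runs again there.
        print("module-level code will be re-executed in every worker")
```

This would fit the observation that moving the import inside main() avoids the exception, since main() is only called in the parent process.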
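If you are on macOS, defining one of the following environment variables can serve as a temporary workaround.
Read more about the issue here
Read more about the issue here
The second option worked for me on macOS 10.14.1 and Python 3.6.0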