multiprocessing Pool with Manager Namespace raises EOFError

Posted 2024-10-02 12:29:34

I use a multiprocessing Manager Namespace to share a DataFrame between pool workers. Each target function calls .sample(5000) on that DataFrame, and the run ends with an EOFError.

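import multiprocessing as mp

import pandas as pd
from psutil import cpu_count  # assumed: the logical= keyword matches psutil.cpu_count

# sharedData (the Manager Namespace), load_data, paths and my_error
# are defined elsewhere in the script and are not shown here.

def get_sample(i):
    print("start round {}".format(i))
    # Each access to sharedData.data asks the manager process for the DataFrame.
    sample = sharedData.data.sample(5000, random_state=i)

if __name__ == '__main__':
    with mp.Pool(cpu_count(logical=False)) as pool0:
        results = pool0.map(load_data, paths)
        sharedData.data = pd.concat(results, axis=0, copy=False)
        genes = sharedData.data.columns
        pool0.close()
        pool0.join()
        del results

    # sampling
    with mp.Pool(cpu_count(logical=True)) as pool:
        print("start sampling, total round = {}".format(1000))
        r = pool.map_async(get_sample, range(1000), error_callback=my_error)
        results2 = r.get()
        pool.close()
        pool.join()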

This is the output and traceback:

start round 145
round35 returns output
round18 returns output
rount161 returns output
start round 704
start round 720
start round 736
start round 752
start round 768
start round 784
start round 800
start round 816
start round 832
start round 848
start round 864
start round 880
start round 896
start round 912
start round 928
start round 944
start round 960
start round 976
start round 992
from error_callback: 

multiprocessing.pool.RemoteTraceback: 
multiprocessing.pool.RemoteTraceback: 
"""

Traceback (most recent call last):
  File "/usr/usc/python/3.6.0/lib/python3.6/multiprocessing/pool.py", line 119, in worker
    result = (True, func(*args, **kwds))
  File "/usr/usc/python/3.6.0/lib/python3.6/multiprocessing/pool.py", line 44, in mapstar
    return list(map(*args))
  File "sampling2temp.py", line 38, in get_sample_ys
    sample = sharedData.data.sample(5000, random_state=i)
  File "/usr/usc/python/3.6.0/lib/python3.6/multiprocessing/managers.py", line 1060, in __getattr__
    return callmethod('__getattribute__', (key,))
  File "/usr/usc/python/3.6.0/lib/python3.6/multiprocessing/managers.py", line 757, in _callmethod
    kind, result = conn.recv()
  File "/usr/usc/python/3.6.0/lib/python3.6/multiprocessing/connection.py", line 250, in recv
    buf = self._recv_bytes()
  File "/usr/usc/python/3.6.0/lib/python3.6/multiprocessing/connection.py", line 407, in _recv_bytes
    buf = self._recv(4)
  File "/usr/usc/python/3.6.0/lib/python3.6/multiprocessing/connection.py", line 383, in _recv
    raise EOFError
EOFError
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "sampling2temp.py", line 105, in <module>
    results2 = r.get()
  File "/usr/usc/python/3.6.0/lib/python3.6/multiprocessing/pool.py", line 608, in get
    raise self._value
EOFError

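It seems that tasks 704 through 992 never return any output, and then the manager process shuts down. So when a running task reads sharedData.data from the Manager Namespace, it receives EOF.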

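By the way, if I change sample(5000) to sample(2500) and reduce the size of the Manager Namespace data from 2127096024 bytes to 1738281624 bytes, there is no EOF problem. Is this because each worker takes up too much memory?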


1 answer
Anonymous user
Answer #1 · Posted 2024-10-02 12:29:34

The receiving end of a multiprocessing Connection raises EOFError when there is nothing left to receive and all associated sending ends have been closed.
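As a minimal sketch of that behaviour, the snippet below uses a one-way multiprocessing.Pipe inside a single process. The helper names recv_all and demo are made up for illustration, and closing the sending end by hand only stands in for the manager process disappearing; it is not what the asker's script does.

```python
from multiprocessing import Pipe


def recv_all(conn):
    """Collect messages from conn until the other end is closed."""
    received = []
    while True:
        try:
            received.append(conn.recv())
        except EOFError:
            # Nothing left to read and the sending end is closed.
            break
    return received


def demo():
    # duplex=False gives a receive-only end and a send-only end.
    receiver, sender = Pipe(duplex=False)
    sender.send("round 0 done")
    sender.close()  # stands in for the manager process going away
    return recv_all(receiver)


if __name__ == "__main__":
    print(demo())
```

The message sent before the close is still delivered; only the next recv() raises EOFError, which is the same exception at the bottom of the worker traceback in the question.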

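Judging by the stack trace, the multiprocessing Manager uses multiprocessing Connections under the hood. Your code does not appear to terminate the manager process prematurely, so I think the manager process must be hitting an exception and terminating before you are done with it. Since reducing the sample size seems to fix the problem, it's possible the Manager process gets killed off by the OOM killer for using too much memory. You can check whether that is the case with the command suggested in the linked article: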

dmesg | egrep -i "killed process"

You may see something like this:

host kernel: Out of Memory: Killed process 1234 (python).
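If the kernel log has been saved to a file or captured as text, the same case-insensitive search can be done in Python. This is only a rough sketch: find_oom_kills, oom_killed_pids and sample_log are hypothetical names, and the sample text reuses the example line above plus one made-up unrelated line.

```python
import re

# Same match as: egrep -i "killed process"
OOM_PATTERN = re.compile(r"killed process (\d+)", re.IGNORECASE)


def find_oom_kills(log_text):
    """Return the log lines that report a killed process."""
    return [line for line in log_text.splitlines() if OOM_PATTERN.search(line)]


def oom_killed_pids(log_text):
    """Return the process IDs mentioned in those lines, as integers."""
    return [int(m.group(1)) for m in OOM_PATTERN.finditer(log_text)]


# Illustrative input only: the first line is the example shown above,
# the second is an invented unrelated line.
sample_log = (
    "host kernel: Out of Memory: Killed process 1234 (python).\n"
    "host kernel: some unrelated message\n"
)

if __name__ == "__main__":
    for line in find_oom_kills(sample_log):
        print(line)
    print(oom_killed_pids(sample_log))
```

If a reported PID matches the manager process of the failing run, that supports the OOM explanation.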
