Python multiprocessing: if a process's children start children of their own, where should join() be called?

2 answers

User

#1 · edited 2024-09-30 01:35:53

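join is conceptually quite simple: x.join() means "the current thread of execution (here, a process) may not proceed past this point until x has terminated."

So, in general, you don't want the main thread to continue past a certain point until all of your workers have finished their work. Since you run task0 in the main thread, calling join there keeps the main thread from going past that point until all of your workers (including the task1 and task2 processes) have finished.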
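
As a minimal sketch of that rule: the snippet below starts a few workers, joins each one, and only then inspects them. The names worker and run_workers are made up for this illustration, and the code assumes a Unix-like system because it asks for the fork start method explicitly.

```python
import multiprocessing as mp
import time

# Assumption: a Unix-like system where the "fork" start method is available.
ctx = mp.get_context("fork")


def worker(seconds):
    """Hypothetical worker: sleeps to stand in for real work."""
    time.sleep(seconds)


def run_workers(count, seconds):
    """Start `count` workers, join them all, and report their final state."""
    workers = [ctx.Process(target=worker, args=(seconds,)) for _ in range(count)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()  # the caller does not get past this line until w has exited
    all_finished = not any(w.is_alive() for w in workers)
    return all_finished, [w.exitcode for w in workers]


if __name__ == "__main__":
    print(run_workers(3, 0.2))
```

Starting every worker before joining any of them keeps the workers running in parallel; joining inside the first loop would run them one after another.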

But wait, I didn't `join` in `task1`!

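Right. But the task1 process still won't terminate until all of its task2 processes have finished. This is not something POSIX process groups enforce (a parent is free to exit before its children); it comes from multiprocessing itself, which joins any non-daemonic children a process still has when that process exits normally. So, let's look at the output of this simplified example: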

import multiprocessing as mp
from time import sleep

def task2():
    sleep(1)
    print("I am doing something important.")

def task1():
    # Start two task2 processes and return without joining them.
    for i in range(2):
        process = mp.Process(target=task2)
        process.start()

    print('task1 done')

def task0():
    process = mp.Process(target=task1)
    process.start()
    process.join()  # the only explicit join in the script

if __name__ == '__main__':
    task0()
    print('all done')

Output:

task1 done
I am doing something important.
I am doing something important.
all done
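
As you can see, task1 reached its end but did not terminate until its children had finished. That means the join in task0 correctly kept our main thread from finishing before all of the workers had terminated.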
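
A rough way to check this without reading interleaved prints is to time the outer join. In the sketch below, task1 starts its children and returns immediately, yet the join on task1 should still take about as long as task2 sleeps. time_outer_join and the delay parameter are names invented for this illustration, and the fork start method is an assumption (Unix-like systems only).

```python
import multiprocessing as mp
import time

# Assumption: a Unix-like system where the "fork" start method is available.
ctx = mp.get_context("fork")


def task2(delay):
    time.sleep(delay)


def task1(delay):
    # Start two children and return right away, with no explicit join().
    for _ in range(2):
        ctx.Process(target=task2, args=(delay,)).start()


def time_outer_join(delay):
    """Return how long joining task1 takes, in seconds."""
    process = ctx.Process(target=task1, args=(delay,))
    start = time.monotonic()
    process.start()
    process.join()  # returns only after the task1 process has really exited
    return time.monotonic() - start


if __name__ == "__main__":
    print("join on task1 took %.2fs" % time_outer_join(0.5))
```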
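
Interestingly, here is the output of ps jf while running the original script with no join calls, the only modification being a time.sleep added so that I could catch it running: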

PPID   PID  PGID   SID TTY      TPGID STAT   UID   TIME COMMAND
6780  7385  7385  7385 pts/11    7677 Ss    1000   0:00 bash
7385  7677  7677  7385 pts/11    7677 R+    1000   0:00  \_ ps jf
6780  6866  6866  6866 pts/7     7646 Ss    1000   0:00 bash
6866  7646  7646  6866 pts/7     7646 S+    1000   0:00  \_ python test
7646  7647  7646  6866 pts/7     7646 S+    1000   0:00      \_ python test
7647  7672  7646  6866 pts/7     7646 S+    1000   0:00      |   \_ python test
7647  7673  7646  6866 pts/7     7646 S+    1000   0:00      |   \_ python test
7647  7674  7646  6866 pts/7     7646 S+    1000   0:00      |   \_ python test
7647  7675  7646  6866 pts/7     7646 S+    1000   0:00      |   \_ python test
7647  7676  7646  6866 pts/7     7646 S+    1000   0:00      |   \_ python test
7646  7648  7646  6866 pts/7     7646 S+    1000   0:00      \_ python test
7648  7665  7646  6866 pts/7     7646 S+    1000   0:00      |   \_ python test
7648  7666  7646  6866 pts/7     7646 S+    1000   0:00      |   \_ python test
7648  7667  7646  6866 pts/7     7646 S+    1000   0:00      |   \_ python test
7648  7668  7646  6866 pts/7     7646 S+    1000   0:00      |   \_ python test
7648  7669  7646  6866 pts/7     7646 S+    1000   0:00      |   \_ python test
7646  7649  7646  6866 pts/7     7646 S+    1000   0:00      \_ python test
7649  7656  7646  6866 pts/7     7646 S+    1000   0:00      |   \_ python test
7649  7657  7646  6866 pts/7     7646 S+    1000   0:00      |   \_ python test
7649  7658  7646  6866 pts/7     7646 S+    1000   0:00      |   \_ python test
7649  7659  7646  6866 pts/7     7646 S+    1000   0:00      |   \_ python test
7649  7660  7646  6866 pts/7     7646 S+    1000   0:00      |   \_ python test
7646  7650  7646  6866 pts/7     7646 S+    1000   0:00      \_ python test
7650  7652  7646  6866 pts/7     7646 S+    1000   0:00      |   \_ python test
7650  7653  7646  6866 pts/7     7646 S+    1000   0:00      |   \_ python test
7650  7654  7646  6866 pts/7     7646 S+    1000   0:00      |   \_ python test
7650  7655  7646  6866 pts/7     7646 S+    1000   0:00      |   \_ python test
7650  7670  7646  6866 pts/7     7646 S+    1000   0:00      |   \_ python test
7646  7651  7646  6866 pts/7     7646 S+    1000   0:00      \_ python test
7651  7661  7646  6866 pts/7     7646 S+    1000   0:00          \_ python test
7651  7662  7646  6866 pts/7     7646 S+    1000   0:00          \_ python test
7651  7663  7646  6866 pts/7     7646 S+    1000   0:00          \_ python test
7651  7664  7646  6866 pts/7     7646 S+    1000   0:00          \_ python test
7651  7671  7646  6866 pts/7     7646 S+    1000   0:00          \_ python test
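
You can see that our main process (executing task0) and the first-level children (the ones executing task1) are all still alive, even though they have clearly run out of Python code to execute. They are also all members of the same process group (PGID 7646; the TPGID column shows the terminal's foreground process group, which here is the same group).

Sum it up, man

All of this is a long-winded way of saying: a join in the main thread is usually all you need, because any child process that exits normally will wait for its own non-daemonic children to terminate before it terminates itself.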
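
If you would rather not lean on that implicit wait, for example because each level should check the exit codes of what it started, every level can join its own children explicitly. This is only a sketch under assumptions: the delay and count parameters are invented for the illustration, and the fork start method limits it to Unix-like systems.

```python
import multiprocessing as mp
import time

# Assumption: a Unix-like system where the "fork" start method is available.
ctx = mp.get_context("fork")


def task2(delay):
    time.sleep(delay)


def task1(delay, n_task2):
    children = [ctx.Process(target=task2, args=(delay,)) for _ in range(n_task2)]
    for child in children:
        child.start()
    for child in children:
        child.join()  # task1 waits for its own task2 processes
    if any(child.exitcode != 0 for child in children):
        raise SystemExit(1)  # surface a failed task2 as task1's exit code


def task0(delay=0.2, n_task1=2, n_task2=2):
    workers = [ctx.Process(target=task1, args=(delay, n_task2)) for _ in range(n_task1)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()  # task0 waits only for the task1 processes it started
    return [w.exitcode for w in workers]


if __name__ == "__main__":
    print(task0())
```

Each process joins only what it started itself, which matches the parent/child rule the second answer describes.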

User

#2 · edited 2024-09-30 01:35:53

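On Unix-like systems (Linux, BSD, and so on), mp.Process actually calls os.fork, and the join method of the resulting process object calls wait (or a variant)¹ to wait for it (that is, for that specific process, not just any process).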
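
Stripped of everything multiprocessing adds on top, the fork-then-wait pattern looks roughly like the sketch below. It is not the library's source; fork_and_wait and the exit code 7 are arbitrary choices for the illustration, and it runs only where os.fork exists.

```python
import os
import time


def fork_and_wait(delay):
    """Fork a child, wait for that specific child, and return its exit code."""
    pid = os.fork()
    if pid == 0:
        # Child process: pretend to work, then exit with a recognizable code.
        time.sleep(delay)
        os._exit(7)
    # Parent process: block until the child with this exact PID has exited.
    waited_pid, status = os.waitpid(pid, 0)
    assert waited_pid == pid
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    return None  # the child was terminated by a signal instead


if __name__ == "__main__":
    print("child exit code:", fork_and_wait(0.2))
```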
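
A forked child can only be waited for by its parent,² so a process can wait for each of its own children but not for any of its grandchildren. Likewise, each child can wait for all of its own children, but not for the children of any other process.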
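
The restriction is easy to provoke with raw os calls. In the sketch below, the top-level process learns its grandchild's PID through a pipe and then tries to wait on it, which should fail with ChildProcessError (ECHILD). The function name and the pipe-based hand-off are assumptions made for this illustration; Unix-like systems only.

```python
import os


def wait_for_grandchild_fails():
    """Return True if waiting on a grandchild raises ChildProcessError."""
    read_fd, write_fd = os.pipe()
    child_pid = os.fork()
    if child_pid == 0:
        # Child: create a grandchild and tell the top-level process its PID.
        os.close(read_fd)
        grandchild_pid = os.fork()
        if grandchild_pid == 0:
            os._exit(0)  # grandchild: nothing to do
        os.write(write_fd, str(grandchild_pid).encode())
        os.waitpid(grandchild_pid, 0)  # allowed: it is this process's own child
        os._exit(0)

    # Top-level process.
    os.close(write_fd)
    grandchild_pid = int(os.read(read_fd, 64))
    os.close(read_fd)
    try:
        os.waitpid(grandchild_pid, os.WNOHANG)
        raised = False
    except ChildProcessError:
        raised = True  # ECHILD: that PID is not a child of this process
    os.waitpid(child_pid, 0)  # reap the direct child
    return raised


if __name__ == "__main__":
    print("waiting on a grandchild failed:", wait_for_grandchild_fails())
```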
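
Because each task2 is so short (and every process exits as soon as it returns from its target= function), it is hard to see any difference between joining explicitly and not joining. You need to do something slower (for example time.sleep(), or some real work) to see a real difference.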
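
A small timing sketch makes the difference visible once the task is slow enough. slow_task and time_to_next_line are hypothetical names, the one-second delay is arbitrary, and the fork start method is assumed (Unix-like systems).

```python
import multiprocessing as mp
import time

# Assumption: a Unix-like system where the "fork" start method is available.
ctx = mp.get_context("fork")


def slow_task(delay):
    time.sleep(delay)


def time_to_next_line(delay, use_join):
    """Return how long the caller takes to get past start() (and join())."""
    process = ctx.Process(target=slow_task, args=(delay,))
    start = time.monotonic()
    process.start()
    if use_join:
        process.join()
    elapsed = time.monotonic() - start  # when the "next line" would have run
    process.join()  # clean up in both cases before returning
    return elapsed


if __name__ == "__main__":
    print("with join:    %.2fs" % time_to_next_line(1.0, True))
    print("without join: %.2fs" % time_to_next_line(1.0, False))
```

With the join, the caller is held for roughly the full delay; without it, the caller moves on as soon as the child has been started.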
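
¹ The actual call is os.waitpid(); see multiprocessing/forking.py (multiprocessing/popen_fork.py in Python 3). The call itself is made in the poll function.

² If a parent exits without waiting for its children, those children are "orphaned" and handed over to PID 1 (init) as a stand-in parent. Process 1 calls wait (or an equivalent) in a loop to clean them up.

(Windows variants use a different call, since Windows cannot fork, and I don't work on Windows, so I'm not sure how things behave there.)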
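
The reparenting described in footnote 2 can be observed from inside an orphan. In this sketch a child forks a grandchild and exits without waiting; the grandchild polls os.getppid() until it changes and reports the new value through a pipe. The function name, the five-second limit, and the pipe are assumptions made for the illustration. On many modern Linux systems the new parent is a "subreaper" process rather than PID 1, so the code reports whatever PID it sees instead of expecting 1.

```python
import os
import time


def orphan_new_parent():
    """Return (pid of the exited parent, parent PID the orphan sees afterwards)."""
    read_fd, write_fd = os.pipe()
    child_pid = os.fork()
    if child_pid == 0:
        os.close(read_fd)
        parent_pid = os.getpid()
        if os.fork() == 0:
            # Grandchild: wait (at most 5 s) until it has been handed to a new parent.
            deadline = time.monotonic() + 5
            while os.getppid() == parent_pid and time.monotonic() < deadline:
                time.sleep(0.01)
            os.write(write_fd, str(os.getppid()).encode())
            os._exit(0)
        os._exit(0)  # child: exit right away, without waiting for the grandchild

    # Top-level process.
    os.close(write_fd)
    os.waitpid(child_pid, 0)  # reap the direct child
    new_parent = int(os.read(read_fd, 64))  # blocks until the orphan reports
    os.close(read_fd)
    return child_pid, new_parent


if __name__ == "__main__":
    old, new = orphan_new_parent()
    print("orphan's parent changed from %d to %d" % (old, new))
```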
