<p>When a child process is <code>os.fork</code>ed, Linux uses <a href="http://en.wikipedia.org/wiki/Copy-on-write" rel="noreferrer">copy-on-write</a>. To demonstrate:</p>
<pre><code>import multiprocessing as mp
import numpy as np
import logging
import os

logger = mp.log_to_stderr(logging.WARNING)

def free_memory():
    total = 0
    with open('/proc/meminfo', 'r') as f:
        for line in f:
            line = line.strip()
            if any(line.startswith(field) for field in ('MemFree', 'Buffers', 'Cached')):
                field, amount, unit = line.split()
                amount = int(amount)
                if unit != 'kB':
                    raise ValueError(
                        'Unknown unit {u!r} in /proc/meminfo'.format(u=unit))
                total += amount
    return total

def worker(i):
    x = data[i, :].sum()    # Exercise access to data
    logger.warn('Free memory: {m}'.format(m=free_memory()))

def main():
    procs = [mp.Process(target=worker, args=(i,)) for i in range(4)]
    for proc in procs:
        proc.start()
    for proc in procs:
        proc.join()

logger.warn('Initial free: {m}'.format(m=free_memory()))
N = 15000
data = np.ones((N, N))
logger.warn('After allocating data: {m}'.format(m=free_memory()))

if __name__ == '__main__':
    main()
</code></pre>
<p>Running it yields:</p>
<pre><code>[WARNING/MainProcess] Initial free: 2522340
[WARNING/MainProcess] After allocating data: 763248
[WARNING/Process-1] Free memory: 760852
[WARNING/Process-2] Free memory: 757652
[WARNING/Process-3] Free memory: 757264
[WARNING/Process-4] Free memory: 756760
</code></pre>
<p>This shows that initially there was roughly 2.5 GB of free memory.
After allocating the 15000x15000 array of <code>float64</code>s, 763248 kB were free. This roughly makes sense, since 15000**2 * 8 bytes = 1.8 GB, and the drop in free memory, 2.5 GB - 0.763248 GB, is also roughly 1.8 GB.</p>
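The arithmetic can be checked directly (the kB figures are the ones taken from the log output above):

<pre><code>import numpy as np

N = 15000
# N*N float64 elements at 8 bytes each
nbytes = N * N * np.dtype(np.float64).itemsize
print(nbytes)            # 1800000000 bytes, i.e. ~1.8 GB
print(nbytes / 1024.)    # 1757812.5 kB

# The drop in reported free memory agrees:
drop_kb = 2522340 - 763248   # 'Initial free' minus 'After allocating data'
print(drop_kb)               # 1759092 kB, again ~1.8 GB
</code></pre>

The small difference between 1757812.5 kB and 1759092 kB is just other activity on the machine between the two measurements.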
<p>Now, after each process is spawned, the free memory is again reported as ~750 MB. There is no significant decrease in free memory, so I conclude the system must be using copy-on-write.</p>
<p>Conclusion: if you do not need to modify the data, defining it at the global level of the <code>__main__</code> module is a convenient and (at least on Linux) memory-friendly way to share it among child processes.</p>
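<p>One caveat worth spelling out: copy-on-write keeps the pages shared only as long as nobody writes to them, and a child's writes land in its own private copies, so they are never seen by the parent. A minimal sketch of this (assuming the Linux <code>fork</code> start method; the array name and the value 99 are just for illustration):</p>

<pre><code>import multiprocessing as mp
import numpy as np

data = np.zeros(5)   # global in __main__, inherited by the child

def worker():
    # The write touches a shared page, so the kernel gives the child
    # its own copy; the change is visible here but never propagates
    # back to the parent.
    data[0] = 99
    assert data[0] == 99

if __name__ == '__main__':
    proc = mp.Process(target=worker)
    proc.start()
    proc.join()
    print(data[0])   # still 0.0 in the parent
</code></pre>

<p>Note also that it is the numpy buffer staying untouched that keeps it shared; large collections of ordinary Python objects fare worse, because CPython's reference counting writes to every object it touches, so even "read-only" access can gradually force pages to be copied.</p>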