<p>Since <code>multiprocessing.Pool</code> would not release roughly 1 GB of memory, I also tried replacing it with <code>ThreadPool</code>, but that was no better. I would still like to know what causes the memory leak in the pool.</p>
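<p>For reference, the pool variant I am describing looked roughly like this (a minimal, self-contained sketch: a dummy <code>fake_hsv</code> stands in for <code>skimage.color.rgb2hsv</code> so it runs without scikit-image):</p>
<pre><code>from multiprocessing.pool import ThreadPool

import numpy as np

def fake_hsv(img):
    # stand-in for skimage.color.rgb2hsv, to keep the sketch self-contained
    return img * 0.5

imgs = np.random.rand(8, 16, 16, 3)

# map() blocks until every image is processed; the with-block closes the pool
with ThreadPool(processes=4) as pool:
    out = pool.map(fake_hsv, list(imgs))

print(len(out))  # 8
</code></pre>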
<p>This may not be the best solution, but it can serve as a workaround.</p>
<p>Instead of using a <code>ThreadPool</code> or <code>ProcessPool</code>, I create the threads (or processes) manually and hand each one a single image to convert to HSV. I have commented out the line <code>p = multiprocessing.Process(target=do_hsv, args=(imgs[j], shared_list))</code>, because spawning a new process for every image conversion would be overkill and far more expensive than threads. Creating the threads by hand obviously takes longer (9 seconds with <strong>no memory leak</strong>) than <code>ThreadPool</code> (4 seconds, but <strong>with a memory leak</strong>), but as you can see, memory usage stays almost flat.</p>
<p>Here is my code:</p>
<pre><code>import multiprocessing
import os
import threading
import time

from memory_profiler import profile
import numpy as np
from skimage.color import rgb2hsv


def do_hsv(img, shared_list):
    shared_list.append(rgb2hsv(img))
    # print("Converted by process {} having parent process {}".format(os.getpid(), os.getppid()))


@profile
def parallel_convert_all_to_hsv(imgs, shared_list):
    cores = os.cpu_count()
    starttime = time.time()
    # convert the images in batches of `cores` threads at a time
    for i in range(0, len(imgs), cores):
        jobs = []
        end = min(i + cores, len(imgs))
        for j in range(i, end):
            # p = multiprocessing.Process(target=do_hsv, args=(imgs[j], shared_list))
            p = threading.Thread(target=do_hsv, args=(imgs[j], shared_list))
            jobs.append(p)
        for p in jobs:
            p.start()
        for proc in jobs:
            proc.join()
    print("Took {} seconds to complete".format(time.time() - starttime))
    return 1


@profile
def doit():
    print("create random images")
    max_images = 700
    images = np.random.rand(max_images, 300, 300, 3)
    manager = multiprocessing.Manager()
    shared_list = manager.list()
    parallel_convert_all_to_hsv(images, shared_list)
    del images
    del shared_list
    print()


doit()
</code></pre>
<p>Here is the output:</p>
<pre><code>create random images
Took 9.085552453994751 seconds to complete
Filename: MemoryNotFreed.py

Line #    Mem usage    Increment   Line Contents
================================================
    15   1549.1 MiB   1549.1 MiB   @profile
    16                             def parallel_convert_all_to_hsv(imgs, shared_list):
    17
    18   1549.1 MiB      0.0 MiB       cores = os.cpu_count()
    19
    20   1549.1 MiB      0.0 MiB       starttime = time.time()
    21
    22   1566.4 MiB      0.0 MiB       for i in range(0, len(imgs), cores):
    23
    24                                     # print("i :", i)
    25
    26   1566.4 MiB      0.0 MiB           jobs = []; pipes = []
    27
    28   1566.4 MiB      0.0 MiB           end = i + cores if (i + cores) <= len(imgs) else i + len(imgs[i : -1]) + 1
    29
    30                                     # print("end :", end)
    31
    32   1566.4 MiB      0.0 MiB           for j in range(i, end):
    33                                         # print("j :", j)
    34
    35                                         # p = multiprocessing.Process(target=do_hsv, args=(imgs[j], shared_list))
    36   1566.4 MiB      0.0 MiB               p = threading.Thread(target= do_hsv, args=(imgs[j], shared_list))
    37
    38   1566.4 MiB      0.0 MiB               jobs.append(p)
    39
    40   1566.4 MiB      0.8 MiB           for p in jobs: p.start()
    41
    42   1574.9 MiB      1.0 MiB           for proc in jobs:
    43   1574.9 MiB     13.5 MiB               proc.join()
    44
    45   1563.5 MiB      0.0 MiB       print("Took {} seconds to complete ".format(starttime - time.time()))
    46   1563.5 MiB      0.0 MiB       return 1


Filename: MemoryNotFreed.py

Line #    Mem usage    Increment   Line Contents
================================================
    48    106.6 MiB    106.6 MiB   @profile
    49                             def doit():
    50
    51    106.6 MiB      0.0 MiB       print("create random images")
    52
    53    106.6 MiB      0.0 MiB       max_images = 700
    54
    55   1548.7 MiB   1442.1 MiB       images = np.random.rand(max_images, 300, 300,3)
    56
    57                                 # images = [x for x in range(0, 10000)]
    58   1549.0 MiB      0.3 MiB       manager = multiprocessing.Manager()
    59   1549.1 MiB      0.0 MiB       shared_list = manager.list()
    60
    61   1563.5 MiB     14.5 MiB       parallel_convert_all_to_hsv(images, shared_list)
    62
    63    121.6 MiB      0.0 MiB       del images
    64
    65    121.6 MiB      0.0 MiB       del shared_list
    66
    67    121.6 MiB      0.0 MiB       print()
</code></pre>
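<p>As a side note for anyone who still wants the convenience of a pool: one knob that may help (I have not benchmarked it on this exact workload) is <code>maxtasksperchild</code> on <code>multiprocessing.Pool</code>. Each worker process exits after the given number of tasks and is replaced by a fresh one, so whatever memory it accumulated is returned to the OS. A minimal sketch with a dummy task:</p>
<pre><code>import multiprocessing

def work(x):
    # dummy task standing in for the per-image HSV conversion
    return x * x

if __name__ == "__main__":
    # each worker exits after 10 tasks and is replaced by a fresh process
    with multiprocessing.Pool(processes=2, maxtasksperchild=10) as pool:
        results = pool.map(work, range(100))
    print(results[:5])  # [0, 1, 4, 9, 16]
</code></pre>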