Python multiprocessing: Pool.map() doesn't seem to call the function

Posted 2024-05-19 11:29:58


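I'm new to multithreading, so apologies if this is basic. I have a function that OCRs image files, and I want to run that task in parallel. The function doesn't return anything; it only saves the OCR'd text to disk. The code is as follows: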

start_time = time.time()
path = 'C:\\Users\\RNCZF01\\Documents\\Cameron-Fen\\Economics-Projects\\Patent-project\\similarity\\Patents\\OCR-test'
listfiles = os.listdir(path)

filterfiles = [p for p in listfiles if p[-4:] == '.tif']

pool = Pool(processes=2)

result = pool.map(OCRimage,filterfiles)

pool.close()
pool.join()

print("--- %s seconds ---" % (time.time() - start_time))

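When I run the code, it seems to get stuck at pool.map(). I let it run for 30 minutes, which is far longer than the test job should take, and it produced no output at all. I instrumented my function OCRimage, and it doesn't appear to enter the function even once (I used print(1) as the first line of OCRimage). I was wondering if anyone could help. Thanks,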

Cameron
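
One thing worth checking, given the Windows paths above: on Windows, multiprocessing starts each worker in a fresh interpreter that re-imports the main module, so the Python documentation asks for the pool setup to sit behind an `if __name__ == "__main__":` guard, with the worker defined at module level. Output printed by workers may also not show up in some IDE consoles. The post does not confirm that this is the cause here, so the following is only a minimal sketch of the guarded layout; `fake_ocr` and `run_all` are hypothetical stand-ins, not the asker's code.

```python
import os
from multiprocessing import Pool


def fake_ocr(filename):
    """Hypothetical stand-in for OCRimage: returns the output name, no OCR."""
    stem, _ext = os.path.splitext(filename)
    return "OCR-" + stem + ".txt"


def run_all(filenames, processes=2):
    """Map the module-level worker over the names with a process pool."""
    with Pool(processes=processes) as pool:
        # pool.map blocks until every task has finished.
        return pool.map(fake_ocr, filenames)


if __name__ == "__main__":
    # Child processes re-import this module on Windows; the guard keeps
    # them from trying to build their own pool during that import.
    print(run_all(["a.tif", "b.tif"]))
```

Saving this as a script and running it from a plain command prompt, rather than from an interactive session, makes it easier to see worker output and any traceback.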

Edit (added the OCRimage function):

The OCRimage function looks like this:

def OCRimage(f):
    #This runs the magick bash script which splits a multi-image tif into multiple single image tiffs
    process = subprocess.Popen(["magick", path + "\\" + f, path + "\\temp\\%d.tif"], shell=True, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
    print(process.communicate()[0])

    #finds the number of pages for each tiff file (this might not be necessary, but os.listdir could return the files in arbitrary order)
    max1 = -1
    for filename in os.listdir(path+'\\temp'):    
        if (max1 < int(filename[0:-4])):
            max1 = int(filename[0:-4])
    max1 = max1 + 1

    text = ""
    for each in range(0,max1):
        im = Image.open(path + "\\temp\\"+ str(each) + ".tif")
        text = text + pytesseract.image_to_string(im)
    with open(path + "\\result\\OCR-"+f[0:-4]+".txt", 'w') as file:
        file.write(text)    

    for f in os.listdir(path+'\\temp'):
        os.remove(path + '\\temp\\' + f)
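
Separately from the hang, note that every call to OCRimage writes its pages into the same `temp` folder and then empties it. With two worker processes, one call could count or delete pages that belong to the other. A possible way around this, sketched below under the assumption that each call may use its own scratch directory, is to create a private temporary directory per call and order the pages by their numeric name. `page_paths` and `with_private_temp` are hypothetical helpers, not part of the original code.

```python
import os
import tempfile


def page_paths(temp_dir):
    """Return the single-page .tif paths in temp_dir, ordered by page number."""
    names = [n for n in os.listdir(temp_dir) if n.lower().endswith(".tif")]
    # File names look like "0.tif", "1.tif", ...; sort on the integer part.
    names.sort(key=lambda n: int(os.path.splitext(n)[0]))
    return [os.path.join(temp_dir, n) for n in names]


def with_private_temp(work):
    """Run work(temp_dir) in a directory that only this call uses."""
    with tempfile.TemporaryDirectory() as temp_dir:
        # The directory and its contents are removed when the block exits.
        return work(temp_dir)
```

Inside OCRimage, the magick output pattern would then point into the private directory, and the loop would iterate over `page_paths(temp_dir)` instead of computing `max1`.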

Edit 2: here are all the imports

import time
import subprocess
import os
import pytesseract
from PIL import Image

from multiprocessing import Pool
import multiprocessing
countcpus = multiprocessing.cpu_count()

Edit 3:

Running OCRimage(f) on its own works fine. Instead of the multiprocessing code, I just use the following:

path = 'C:\\Users\\RNCZF01\\Documents\\Cameron-Fen\\Economics-Projects\\Patent-project\\similarity\\Patents\\OCR-test'
for p in os.listdir(path):
    OCRimage(p)
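
The sequential loop above passes every entry returned by os.listdir to OCRimage, which would include the `temp` and `result` subfolders the function uses, whereas the pooled version filters on `.tif` first. For a like-for-like comparison, the same filter can be applied in both places. A small sketch, where `tif_files` is a hypothetical helper:

```python
import os


def tif_files(names):
    """Keep only names whose extension is .tif, ignoring case."""
    return [n for n in names if os.path.splitext(n)[1].lower() == ".tif"]


# Intended use with the code above (not executed here):
#     for p in tif_files(os.listdir(path)):
#         OCRimage(p)
```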
