Failed attempt to make a program multithreaded in Python 3.5

Posted 2024-09-29 23:30:05


I am trying to write my program's output to a file using Python 3.5 on Ubuntu. Here is what I tried first, before attempting multithreading:

from fuzzywuzzy import process, fuzz
import ast

def people(email):
    # Check the names of people with the fuzzywuzzy library
    # (matching logic elided in the original post)
    result = ...
    return result

writel = open(r'output.csv', 'w', encoding='utf-8', errors='ignore')

with open('emailfile.txt', 'r', encoding='ascii', errors='ignore') as Filepointer:
    result = []
    count = 0  # number of lines processed
    for line in Filepointer.readlines():
        count += 1

        data = people(line.strip())

        # compare by value, not by identity
        if data != "":
            result.append(data)
for data in result:
    writel.write(str(data) + "\n")

writel.close()

Then I tried to make it multithreaded in Python 3 with the following code:

from fuzzywuzzy import process, fuzz
import ast
from concurrent.futures import ThreadPoolExecutor
import threading

# Shared list that the worker threads append to
FinalOutput = []

def people(email):
    # Check the names of people with the fuzzywuzzy library
    # (matching logic elided in the original post)
    result = ...
    FinalOutput.append(result)
    print(FinalOutput)
    return

threads = []
writel = open(r'output.csv', 'w', encoding='utf-8', errors='ignore')
count = 0
pool = ThreadPoolExecutor(max_workers=10)
with open('emailfile.txt', 'r', encoding='ascii', errors='ignore') as Filepointer:
    for line in Filepointer.readlines():
        pool.submit(people, line.strip())
pool.shutdown(wait=True)
for data in FinalOutput:
    writel.write(str(data) + "\n")

writel.close()

The code above produces the following error:

Segmentation fault (core dumped)

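I went through the StackOverflow threads related to this problem but found no solution; I still get the same error.
Please let me know what I need to do to make the code run.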


1 answer
User
#1 · Posted 2024-09-29 23:30:05

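Python has a great parallelization tool, the multiprocessing Pool. It runs work in separate processes rather than threads, so it is parallelization rather than multithreading, which seems to be what you are after. What we want is for people to return a value instead of appending its result to a global variable: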

def people(email):
    # This is where the magic happens
    return result

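From there we can create a Pool and call its map function, which automatically distributes the values yielded by the iterable across the worker processes and returns the results in a list, in the same order as the iterable: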

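from multiprocessing import Pool

# Read the e-mail addresses from the input file used in the question
with open('emailfile.txt', 'r', encoding='ascii', errors='ignore') as FilePointer:
    with Pool() as pool:
        # strip the trailing newline from each line before matching
        FinalOutput = pool.map(people, [line.strip() for line in FilePointer])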

with open(r'output.csv', 'w', encoding='utf-8', errors='ignore') as writel:
    for data in FinalOutput:
        writel.write(str(data) + '\n')
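
The return-a-value pattern above is not specific to multiprocessing. As a rough sketch only, the same idea applied to the ThreadPoolExecutor from the question might look like the following. The people body here is a hypothetical stand-in (it just takes the text before "@"), not the fuzzywuzzy logic, and match_all is an illustrative name. Whether a thread-based version avoids the segmentation fault depends on its cause, which this thread does not establish.

```python
from concurrent.futures import ThreadPoolExecutor


def people(email):
    # Hypothetical stand-in for the fuzzywuzzy lookup in the question:
    # returns the part before "@", or "" when the input has no "@".
    name, sep, _domain = email.partition("@")
    return name if sep else ""


def match_all(emails, max_workers=10):
    # executor.map yields results in input order, so the workers do not
    # need to append to a shared list.
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        return [r for r in executor.map(people, emails) if r != ""]


if __name__ == "__main__":
    print(match_all(["alice@example.com", "not-an-email", "bob@example.com"]))
```

Because each call returns its own result, the only shared state left is the list that executor.map builds in the main thread.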

You could also look into a package called joblib, which offers a function that does this in a neater, more flexible way.
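
As a hedged illustration of what that could look like, here is a minimal sketch using joblib's Parallel and delayed. joblib is a third-party package and has to be installed separately. The people function is again a hypothetical stand-in, match_all is an illustrative name, and prefer="threads" is chosen here only to keep the sketch lightweight; leaving the hint out lets joblib use its default process-based backend.

```python
from joblib import Parallel, delayed  # third-party: pip install joblib


def people(email):
    # Hypothetical stand-in for the real matching function.
    return email.strip().lower()


def match_all(emails, n_jobs=2):
    # delayed(people)(e) packages one call; Parallel runs the calls and
    # returns their results as a list in input order.
    return Parallel(n_jobs=n_jobs, prefer="threads")(
        delayed(people)(e) for e in emails
    )


if __name__ == "__main__":
    print(match_all([" Alice@Example.com\n", "BOB@example.com"]))
```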
