TypeError: 'MapResult' object is not iterable when using pathos.multiprocessing

Posted 2024-10-04 03:29:16


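I am running a spelling-correction function over a dataset I have, using from pathos.multiprocessing import ProcessingPool as Pool to do the work. Once processing has finished, I want to actually access the results. Here is my code: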

import codecs
import nltk

from textblob import TextBlob
from nltk.tokenize import sent_tokenize
from pathos.multiprocessing import ProcessingPool as Pool

class SpellCorrect():

    def load_data(self, path_1):
        with codecs.open(path_1, "r", "utf-8") as file:
            data = file.read()
        return sent_tokenize(data)

    def correct_spelling(self, data):
        data = TextBlob(data)
        return str(data.correct())

    def run_clean(self, path_1):
        pool = Pool()
        data = self.load_data(path_1)
        return pool.amap(self.correct_spelling, data)

if __name__ == "__main__":
    path_1 = "../Data/training_data/training_corpus.txt"
    SpellCorrect = SpellCorrect()
    result = SpellCorrect.run_clean(path_1)
    print(result)
    result = " ".join(temp for temp in result)
    with codecs.open("../Data/training_data/training_data_spell_corrected.txt", "a", "utf-8") as file:
        file.write(result)

If you look at the main block, print(result) shows an object of type <multiprocess.pool.MapResult object at 0x1a25519f28>.

I tried to access the results with result = " ".join(temp for temp in result), but that raises TypeError: 'MapResult' object is not iterable. I also tried converting it to a list with list(result), but I get the same error. What can I do to fix this?


1 answer
Answer 1 · posted 2024-10-04 03:29:16

A multiprocess.pool.MapResult object is not iterable because it inherits from AsyncResult, which provides only the following methods:

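  • wait([timeout]) Wait until the result is available or until timeout seconds pass. This method always returns None.

  • ready() Return whether the call has completed.

  • successful() Return whether the call completed without raising an exception. Raises AssertionError if the result is not ready (ValueError in Python 3.7 and later).

  • get([timeout]) Return the result when it arrives. If timeout is not None and the result does not arrive within timeout seconds, TimeoutError is raised. If the remote call raised an exception, get() re-raises that exception.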
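
The methods listed above can be tried out in a small sketch. It assumes that pathos's amap behaves like the standard library's map_async, which returns the same kind of MapResult handle, and it uses ThreadPool (which shares the Pool interface) so that it runs without pathos installed. The function square and the variable names are illustrative only.

```python
from multiprocessing.pool import ThreadPool


def square(x):
    return x * x


# Assumption: map_async stands in for pathos's amap; both hand back a
# MapResult handle rather than the results themselves.
pool = ThreadPool(2)
async_result = pool.map_async(square, range(5))

try:
    list(async_result)          # the handle itself cannot be iterated
    iterating_failed = False
except TypeError:
    iterating_failed = True

async_result.wait(timeout=5)    # blocks until done or timeout; returns None
is_ready = async_result.ready()
was_successful = async_result.successful()  # only safe once the result is ready
squares = async_result.get(timeout=5)       # the actual list of results

pool.close()
pool.join()
print(iterating_failed, is_ready, was_successful, squares)
```

The list only becomes available through get(); every other method reports on the state of the computation.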

You can see an example of how to use get() here: https://docs.python.org/2/library/multiprocessing.html#using-a-pool-of-workers

from multiprocessing import Pool, TimeoutError
import time
import os

def f(x):
    return x*x

if __name__ == '__main__':
    pool = Pool(processes=4)              # start 4 worker processes

    # print "[0, 1, 4,..., 81]"
    print(pool.map(f, range(10)))

    # print same numbers in arbitrary order
    for i in pool.imap_unordered(f, range(10)):
        print(i)

    # evaluate "f(20)" asynchronously
    res = pool.apply_async(f, (20,))      # runs in *only* one process
    print(res.get(timeout=1))             # prints "400"

    # evaluate "os.getpid()" asynchronously
    res = pool.apply_async(os.getpid, ()) # runs in *only* one process
    print(res.get(timeout=1))             # prints the PID of that process

    # launching multiple evaluations asynchronously *may* use more processes
    multiple_results = [pool.apply_async(os.getpid, ()) for i in range(4)]
    print([res.get(timeout=1) for res in multiple_results])

    # make a single worker sleep for 10 secs
    res = pool.apply_async(time.sleep, (10,))
    try:
        print(res.get(timeout=1))
    except TimeoutError:
        print("We lacked patience and got a multiprocessing.TimeoutError")
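
Applied to the question, the fix is to call get() on the MapResult before joining. The sketch below is a minimal illustration under stated assumptions: FIXES and this correct_spelling are hypothetical stand-ins for TextBlob's correct(), and ThreadPool with map_async stands in for the pathos pool and its amap, so the snippet runs with the standard library alone.

```python
from multiprocessing.pool import ThreadPool

# Hypothetical stand-in for TextBlob(...).correct(); not a real spell checker.
FIXES = {"teh": "the", "recieve": "receive"}


def correct_spelling(sentence):
    return " ".join(FIXES.get(word, word) for word in sentence.split())


def run_clean(sentences):
    # Assumption: map_async plays the role of pathos's amap here.
    with ThreadPool(2) as pool:
        async_result = pool.map_async(correct_spelling, sentences)
        # get() blocks until every sentence is processed and returns a list.
        return async_result.get()


corrected = run_clean(["I saw teh cat.", "Did you recieve it?"])
text = " ".join(corrected)
print(text)
```

In the original code, the equivalent change would be to return pool.amap(self.correct_spelling, data).get() from run_clean, or to call the blocking pool.map, which returns the list directly.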
