JVM crashes when using Python Boilerpipe in a Flask application

import json
import unicodedata

from boilerpipe.extract import Extractor


class ExtractingContent:
    @classmethod
    def processingContent(cls, sourceUrl, extractorType="DefaultExtractor"):
        extractor = Extractor(extractor=extractorType, url=sourceUrl)
        extractedText = extractor.getText()
        if extractedText:
            # Drop non-ASCII characters, then decode so the value stays JSON-serializable
            toNormalString = unicodedata.normalize('NFKD', extractedText).encode('ascii', 'ignore').decode('ascii')
            # json.loads() expects a string, so serialize with json.dumps() first
            json_data = json.dumps({"content": toNormalString, "url": sourceUrl, "status": "success", "publisher_id": "XXXXX", "content_count": str(len(toNormalString))})
            return json.loads(json_data)
        else:
            json_data = json.dumps({"response": {"message": "No data found", "url": sourceUrl, "status": "success", "content_count": "empty"}})
            return json.loads(json_data)

1 answer

User

#1 · Posted 2024-09-30 16:31:42

(A copy of what I wrote in https://github.com/misja/python-boilerpipe/issues/17)

OK, I reproduced the error: the thread that calls into the JVM is not attached to it, so calls into the JVM fail. The bug comes from boilerpipe (see below).

First, the monkey patch: in the code you posted on Stack Overflow, you only need to add the following before creating the extractor:

import jpype

class ExtractingContent:
    @classmethod
    def processingContent(cls, sourceUrl, extractorType="DefaultExtractor"):
        # Report whether the calling thread is already attached to the JVM
        print("State= %s" % jpype.isThreadAttachedToJVM())

        if not jpype.isThreadAttachedToJVM():
            print("Needs to attach...")
            jpype.attachThreadToJVM()
            print("Check Attached= %s" % jpype.isThreadAttachedToJVM())

        extractor = Extractor(extractor=extractorType, url=sourceUrl)
        # ... the rest of the method is unchanged

About boilerpipe: the check if threading.activeCount() > 1 at line 50 of boilerpipe/extract/__init__.py is wrong. The calling thread must always be attached to the JVM, even if there is only one thread.
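
The "always attach" rule can be sketched as a small helper that checks the calling thread itself instead of counting threads. This is only an illustration, not the code in python-boilerpipe: ensure_thread_attached and FakeJVM are hypothetical names, and FakeJVM merely imitates the two JPype functions used in the patch above so that the sketch runs without a JVM.

```python
import threading


def ensure_thread_attached(jvm):
    """Attach the calling thread unless it is already attached.

    `jvm` is any object exposing isThreadAttachedToJVM() and
    attachThreadToJVM(); the assumption is that the jpype module
    can be passed here directly.
    Returns True if this call performed the attach.
    """
    # Ask about the current thread; do not rely on how many threads exist.
    if jvm.isThreadAttachedToJVM():
        return False
    jvm.attachThreadToJVM()
    return True


class FakeJVM(object):
    """Hypothetical stand-in for jpype that tracks attachment per thread."""

    def __init__(self):
        # threading.local gives every thread its own 'attached' flag
        self._local = threading.local()

    def isThreadAttachedToJVM(self):
        return getattr(self._local, "attached", False)

    def attachThreadToJVM(self):
        self._local.attached = True


if __name__ == "__main__":
    fake = FakeJVM()
    outcomes = []

    # The main thread attaches once; a second call is a no-op.
    outcomes.append(ensure_thread_attached(fake))
    outcomes.append(ensure_thread_attached(fake))

    # A worker thread starts unattached and needs its own attach.
    worker = threading.Thread(
        target=lambda: outcomes.append(ensure_thread_attached(fake))
    )
    worker.start()
    worker.join()
    print(outcomes)
```

With real JPype, the intended use would be to call ensure_thread_attached(jpype) at the top of processingContent, before the Extractor is created.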
