Apache Beam Python cannot parse PubMed XML

Posted 2024-10-01 04:44:43


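Hi, I wrote a Beam pipeline that reads a directory and parses the downloaded PubMed XML files with the pubmed_parser library. The library works fine in a standard Python program, but when I convert it to an Apache Beam pipeline, parsing fails with the error shown here. I hope you can help me solve this.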

File "/home/micdsouz/venv/medline/data-preprocessing.py", line 19, in process
    pubmed_dict = pp.parse_pubmed_xml(element)
  File "/home/micdsouz/venv/local/lib/python2.7/site-packages/pubmed_parser/pubmed_oa_parser.py", line 112, in parse_pubmed_xml
    dict_article_meta = parse_article_meta(tree)
  File "/home/micdsouz/venv/local/lib/python2.7/site-packages/pubmed_parser/pubmed_oa_parser.py", line 60, in parse_article_meta
    pmid_node = article_meta.find('article-id[@pub-id-type="pmid"]')
AttributeError: 'NoneType' object has no attribute 'find' [while running 'ReadData']
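
For reference, the traceback says that `article_meta` was `None` when `.find` was called on it, meaning no `<article-meta>` element was found in whatever was parsed. The pipeline code is not shown here, so the cause is only a guess: one possibility is that the 'ReadData' step hands the parser something other than a complete article (for example a single line of text rather than a file path). The sketch below uses only the standard library, not pubmed_parser itself, to reproduce the same lookup on a complete document and on a fragment. The sample XML and the `find_pmid` helper are hypothetical and exist only for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical minimal article; real PubMed OA files are much larger.
FULL_DOC = """<article><front><article-meta>
<article-id pub-id-type="pmid">12345</article-id>
</article-meta></front></article>"""


def find_pmid(xml_text):
    """Return the PMID text, or None if <article-meta> or the PMID is missing."""
    root = ET.fromstring(xml_text)
    article_meta = root.find(".//article-meta")
    if article_meta is None:
        # Without this guard, the next line would raise
        # AttributeError: 'NoneType' object has no attribute 'find'
        return None
    pmid_node = article_meta.find('article-id[@pub-id-type="pmid"]')
    return pmid_node.text if pmid_node is not None else None


whole = find_pmid(FULL_DOC)                                   # complete document
fragment = find_pmid("<title>Only one line of a file</title>")  # partial input
```

If logging each `element` inside `process` shows a line of XML instead of a path such as a `.nxml` file name, that would support this guess; if it shows a valid path, the cause lies elsewhere.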

