Converting XML to a dictionary to a DataFrame in Python

<BREVIER>
  <BRV>
    <MONO>stuff</MONO>
    <TITD>stuff</TITD>
    <TITF>Blabla</TITF>
    <CMPD>stuff</CMPD>
    <CMPF>stuff</CMPF>
    <INDD>stuff</INDD>
    <INDF>Blablo</INDF>
    <CINDD>stuff</CINDD>
    <CINDF>stuff</CINDF>
    <POSD>stuff</POSD>
    <POSF>stuff</POSF>
    <DEL>true</DEL>
  </BRV>

# encoding: utf-8
import xmltodict
import pprint
import json
import pandas as pd

with open('Brevier.xml',encoding='UTF-8','rb') as fd:
    my_dict = xmltodict.parse(fd.read(),encoding='UTF-8')
    tableau_indic=pd.DataFrame()
    for section in my_dict ['BREVIER']['BRV']:
        drugname = section.get('TITF')
        print(drugname in tableau_indic.loc(["Nom_du_medicament"]))
        drugindication = section.get('INDF')
        print(drugindication in tableau_indic.loc(["Indication"]))
    print(tableau_indic)
fd.close()

# encoding: utf-8
import xmltodict
import pprint
import json
import pandas as pd

with open('Brevier.xml',encoding='UTF-8') as fd:
    my_dict = xmltodict.parse(fd.read(),encoding='UTF-8')
    tableau_indic=pd.DataFrame
    for section in my_dict ['BREVIER']['BRV']:
        drugname = section.get('TITF')
        print(tableau_indic.loc["Nom_du_medicament"])
        drugindication = section.get('INDF')
        print(tableau_indic.loc["Indication"])
    print(tableau_indic)
fd.close()

1 answer

User

#1 · Posted on 2024-10-02 20:43:56

There are several ways to do this, but since you are working with an XML file, it is usually best to use an XML-aware tool such as XPath.

Assuming your XML looks like this:

meds = """<BREVIER>
  <BRV>
    <MONO>stuff</MONO>
    <TITF>Blabla</TITF>
    <CMPD>stuff</CMPD>
    <INDF>Blablo</INDF>
    <CINDD>stuff</CINDD>
    <DEL>true</DEL>
  </BRV>
  <BRV>
    <MONO>stuff</MONO>
    <TITF>Blabla 2</TITF>
    <CMPD>stuff</CMPD>
    <INDF>Blablo 2</INDF>
    <CINDD>stuff</CINDD>
    <DEL>true</DEL>
  </BRV>
</BREVIER>"""

You can process it with lxml:

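from lxml import etree

# Parse the XML string into an element tree
doc = etree.XML(meds)
print('Nom_du_medicament Indication')
# Visit every <BRV> entry and print the text of its <TITF> and <INDF> children
for m in doc.xpath('//BRV'):
    print(m.xpath('./TITF/text()')[0], m.xpath('./INDF/text()')[0])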

Output:

Nom_du_medicament Indication
Blabla Blablo
Blabla 2 Blablo 2
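
One caveat with the loop above: indexing an XPath result with [0] raises IndexError when a <BRV> entry has no <TITF> or <INDF> child. A minimal sketch of a more forgiving variant, assuming missing fields should simply become empty strings (the sample data with its missing <INDF> is made up for illustration):

```python
from lxml import etree

# Hypothetical sample: the second <BRV> deliberately has no <INDF> element.
sample = """<BREVIER>
  <BRV><TITF>Blabla</TITF><INDF>Blablo</INDF></BRV>
  <BRV><TITF>Blabla 2</TITF></BRV>
</BREVIER>"""

doc = etree.XML(sample)

# findtext() returns the default instead of raising when the child is absent.
pairs = [
    (m.findtext("TITF", default=""), m.findtext("INDF", default=""))
    for m in doc.xpath("//BRV")
]
print(pairs)
```

Whether an empty string, None, or skipping the entry is the right fallback depends on what the data is used for afterwards.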

From here, you can format the output, load it into a DataFrame, or do whatever else you need.
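
As a rough sketch of the DataFrame step, one option is to collect one dictionary per <BRV> entry and hand the list to pandas. The column names are taken from the question; the shortened sample XML and the variable name rows are assumptions made for illustration:

```python
from lxml import etree
import pandas as pd

# Shortened version of the sample XML, keeping only the two fields of interest.
meds = """<BREVIER>
  <BRV>
    <TITF>Blabla</TITF>
    <INDF>Blablo</INDF>
  </BRV>
  <BRV>
    <TITF>Blabla 2</TITF>
    <INDF>Blablo 2</INDF>
  </BRV>
</BREVIER>"""

doc = etree.XML(meds)

# Build one row (a plain dict) per <BRV> element.
# findtext() returns None if the child element is missing.
rows = [
    {
        "Nom_du_medicament": m.findtext("TITF"),
        "Indication": m.findtext("INDF"),
    }
    for m in doc.xpath("//BRV")
]

# Construct the DataFrame once from all rows, with an explicit column order.
tableau_indic = pd.DataFrame(rows, columns=["Nom_du_medicament", "Indication"])
print(tableau_indic)
```

Building the full list first and creating the DataFrame in a single call avoids growing an empty DataFrame row by row inside the loop, which is what the attempts in the question were heading towards.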
