Converting XML to CSV in Python

Posted 2024-10-03 23:25:25


I have the following XML structure, which I am trying to convert to CSV with Python:

<FIXML><Batch>
<PosRpt RptID="34868232064" ReqID="C905EOD20160427" SetSesID="EOD" MtchStat="0" PriSetPx="326.6" SetPx="328.3" SetPxTyp="1" SettlCcy="USD" ReqTyp="1" MsgEvtSrc="REG" BizDt="2016-04-27" SettlDt="2016-07-14" SettlCurrFxRt="1"><Pty ID="CME" R="21"></Pty><Pty ID="905" R="4"></Pty><Pty ID="CBT" R="22"></Pty><Pty ID="905" R="38"><Sub ID="1" Typ="26"/></Pty><Pty ID="905" R="1"></Pty><Instrmt ID="06" Desc="SOYBEAN MEAL FUTURES" CFI="FCAPSO" SecTyp="FUT" Src="H" MMY="201607" MatDt="2016-07-14" Mult="100" Exch="CBT" UOM="tn" UOMQty="100" PxUOM="TON" PxUOMQty="1" ValMeth="FUT" Fctr="1" PxQteCcy="USD" FnlSettlCcy="USD"></Instrmt><Qty Long="2038" Short="1354" Typ="ETR"/><Qty Long="1289" Short="1436" Typ="ALC"/><Qty Long="0" Short="10" Typ="TRF"/><Qty Long="4122" Short="8098" Typ="SOD"/><Qty Long="3957" Short="7406" Typ="FIN"/><Qty Long="937" Short="6325" Typ="IES"/><Qty Long="35" Short="55" Typ="IAS"/><Amt Typ="SMTM" Amt="-675920" Ccy="USD"/><Amt Typ="TVAR" Amt="-325070.33" Ccy="USD"/><Amt Typ="FMTM" Amt="-1000990.33" Ccy="USD"/></PosRpt>
<TrdCaptRpt RptID="21195360680" TrdTyp="0" TrdSubTyp="5" ExecID="85271320160426220810TN0002521" TrdDt="2016-04-27" BizDt="2016-04-27" MLegRptTyp="1" MtchStat="0" MsgEvtSrc="REG" TrdID="106695" LastQty="1" LastPx="323.5" TxnTm="2016-04-27T01:10:25-05:00" SettlCcy="USD" SettlDt="2016-07-14" PxSubTyp="1" VenueTyp="E" VenuTyp="E" OfstInst="0"><Instrmt ID="06" Desc="SOYBEAN MEAL FUTURES" CFI="FCAPSO" SecTyp="FUT" MMY="201607" MatDt="2016-07-14" Mult="100" Exch="CBT" UOM="tn" UOMQty="100" PxUOM="TON" PxUOMQty="1" ValMeth="FUT" Fctr="1" PxQteCcy="USD"></Instrmt><Amt Typ="TVAR" Amt="480" Ccy="USD"/><RptSide Side="1" ClOrdID="25245816" CustCpcty="4" OrdTyp="M" SesID="EOD" SesSub="E" AllocInd="1" AgrsrInd="Y"><Pty ID="CME" R="21"></Pty><Pty ID="905" R="4"></Pty><Pty ID="CBT" R="22"></Pty><Pty ID="905" R="1"></Pty><Pty ID="434GU400" R="24"><Sub ID="1" Typ="26"/></Pty><Pty ID="4QOL" R="12"></Pty><Pty ID="685" R="17"></Pty><Pty ID="4QOL" R="37"></Pty><Pty ID="905" R="38"><Sub ID="1" Typ="26"/></Pty><Pty ID="905" R="7"></Pty><RegTrdID ID="FECC1544943BFEC0302D5F8342" Src="1010000023" Typ="0" Evnt="2"/></RptSide></TrdCaptRpt>
<TrdCaptRpt RptID="21196531008" TrdTyp="0" TrdSubTyp="5" ExecID="88421020160427065733TN0007200" TrdDt="2016-04-27" BizDt="2016-04-27" MLegRptTyp="1" MtchStat="0" MsgEvtSrc="REG" TrdID="115357" LastQty="2" LastPx="325.7" TxnTm="2016-04-27T07:00:12-05:00" SettlCcy="USD" SettlDt="2016-07-14" PxSubTyp="1" VenueTyp="E" VenuTyp="E" OfstInst="0"><Instrmt ID="06" Desc="SOYBEAN MEAL FUTURES" CFI="FCAPSO" SecTyp="FUT" MMY="201607" MatDt="2016-07-14" Mult="100" Exch="CBT" UOM="tn" UOMQty="100" PxUOM="TON" PxUOMQty="1" ValMeth="FUT" Fctr="1" PxQteCcy="USD"></Instrmt><Amt Typ="TVAR" Amt="-520" Ccy="USD"/><RptSide Side="2" ClOrdID="25246712" CustCpcty="4" OrdTyp="M" SesID="EOD" SesSub="E" AllocInd="1" AgrsrInd="Y"><Pty ID="CME" R="21"></Pty><Pty ID="905" R="4"></Pty><Pty ID="CBT" R="22"></Pty><Pty ID="905" R="1"></Pty><Pty ID="434GU400" R="24"><Sub ID="1" Typ="26"/></Pty><Pty ID="4QOL" R="12"></Pty><Pty ID="685" R="17"></Pty><Pty ID="4QOL" R="37"></Pty><Pty ID="905" R="38"><Sub ID="1" Typ="26"/></Pty><Pty ID="905" R="7"></Pty><RegTrdID ID="FECC1544943BFEC0302D64A564" Src="1010000023" Typ="0" Evnt="2"/></RptSide></TrdCaptRpt>
<PosRpt RptID="34868266266" ReqID="C905EOD20160427" SetSesID="EOD" MtchStat="0" PriSetPx="136" SetPx="136" SetPxTyp="1" SettlCcy="USD" ReqTyp="1" MsgEvtSrc="REG" BizDt="2016-04-27" SettlDt="2016-12-28" SettlCurrFxRt="1"><Pty ID="CME" R="21"></Pty><Pty ID="905" R="4"></Pty><Pty ID="CBT" R="22"></Pty><Pty ID="99106105" R="38"><Sub ID="2" Typ="26"/></Pty><Pty ID="905" R="1"></Pty><Instrmt ID="UFU" Desc="UAN FOB NOLA SWAP" CFI="FCACSO" SecTyp="FUT" Src="H" MMY="201612" MatDt="2016-12-28" Mult="100" Exch="CBT" UOM="tn" UOMQty="100" PxUOM="TON" PxUOMQty="1" ValMeth="FUT" Fctr="1" PxQteCcy="USD" FnlSettlCcy="USD"></Instrmt><Qty Long="30" Short="0" Typ="SOD"/><Qty Long="30" Short="0" Typ="FIN"/><Qty Long="30" Short="0" Typ="IES"/><Amt Typ="SMTM" Amt="0" Ccy="USD"/><Amt Typ="TVAR" Amt="0" Ccy="USD"/><Amt Typ="FMTM" Amt="0" Ccy="USD"/><RegTrdID ID="PSC152CEF79387P0203D81FA" Src="1010000023" Typ="0" Evnt="2"/></PosRpt>
<PosRpt RptID="34868372999" ReqID="C905EOD20160427" SetSesID="EOD" MtchStat="0" PriSetPx="675.25" SetPx="669.25" SetPxTyp="1" SettlCcy="USD" ReqTyp="1" MsgEvtSrc="REG" BizDt="2016-04-27" SettlDt="2016-06-30" SettlCurrFxRt="1"><Pty ID="CME" R="21"></Pty><Pty ID="905" R="4"></Pty><Pty ID="CME" R="22"></Pty><Pty ID="98812736" R="38"><Sub ID="2" Typ="26"/></Pty><Pty ID="905" R="1"></Pty><Instrmt ID="CPC" Desc="MALAYSIAN CRUDE PALM OIL CAL S" CFI="FCACSO" SecTyp="FUT" Src="H" MMY="201606" MatDt="2016-06-30" Mult="25" Exch="CME" UOMQty="25" PxUOM="MTONS" PxUOMQty="1" ValMeth="FUT" Fctr="1" PxQteCcy="USD" FnlSettlCcy="USD"></Instrmt><Qty Long="0" Short="200" Typ="SOD"/><Qty Long="0" Short="200" Typ="FIN"/><Qty Long="0" Short="200" Typ="IES"/><Amt Typ="SMTM" Amt="30000" Ccy="USD"/><Amt Typ="TVAR" Amt="0" Ccy="USD"/><Amt Typ="FMTM" Amt="30000" Ccy="USD"/><RegTrdID ID="PSC154373D5298P0302DFC70" Src="1010000023" Typ="0" Evnt="2"/></PosRpt>
<PosRpt RptID="34868373000" ReqID="C905EOD20160427" SetSesID="EOD" MtchStat="0" PriSetPx="665.75" SetPx="661.5" SetPxTyp="1" SettlCcy="USD" ReqTyp="1" MsgEvtSrc="REG" BizDt="2016-04-27" SettlDt="2016-11-30" SettlCurrFxRt="1"><Pty ID="CME" R="21"></Pty><Pty ID="905" R="4"></Pty><Pty ID="CME" R="22"></Pty><Pty ID="98812736" R="38"><Sub ID="2" Typ="26"/></Pty><Pty ID="905" R="1"></Pty><Instrmt ID="CPC" Desc="MALAYSIAN CRUDE PALM OIL CAL S" CFI="FCACSO" SecTyp="FUT" Src="H" MMY="201611" MatDt="2016-11-30" Mult="25" Exch="CME" UOMQty="25" PxUOM="MTONS" PxUOMQty="1" ValMeth="FUT" Fctr="1" PxQteCcy="USD" FnlSettlCcy="USD"></Instrmt><Qty Long="0" Short="400" Typ="SOD"/><Qty Long="0" Short="400" Typ="FIN"/><Qty Long="0" Short="400" Typ="IES"/><Amt Typ="SMTM" Amt="42500" Ccy="USD"/><Amt Typ="TVAR" Amt="0" Ccy="USD"/><Amt Typ="FMTM" Amt="42500" Ccy="USD"/><RegTrdID ID="PSC1540E0A7EA6P0302DFB8E" Src="1010000023" Typ="0" Evnt="2"/></PosRpt>
<TrdCaptRpt RptID="21202575211" TrdTyp="0" TrdDt="2016-04-27" BizDt="2016-04-27" MLegRptTyp="2" MtchStat="1" MsgEvtSrc="REG" TrdID="000991" LastQty="100" LastPx="0.31" TxnTm="2016-04-27T12:33:54-05:00" SettlCcy="USD" SettlDt="2016-08-03" OrigTrdID="15457C3D779LEB0202D1BC6" PxSubTyp="1" VenueTyp="P" VenuTyp="P"><Instrmt ID="DA" Desc="CLASS III MILK OPTIONS" CFI="OCAXPS" SecTyp="OOF" MMY="201607" MatDt="2016-08-03" StrkPx="13.75" Mult="2000" Exch="CME" UOM="lbs" UOMQty="200000" PxUOM="LBS" PxUOMQty="100" PutCall="1" ValMeth="EQTY" Fctr="1" PxQteCcy="USD"></Instrmt><Undly CFI="FCACSO" Desc="CLASS III MILK FUTURES" ID="DA" Src="H" MMY="201607" SecTyp="FUT" Exch="CME"></Undly><Amt Typ="PREM" Amt="62000" Ccy="USD"/><RptSide Side="2" ClOrdID="660" CustCpcty="4" OrdTyp="L" SesID="EOD" SesSub="P" TmBkt="V" AllocInd="1" AgrsrInd="Y"><Pty ID="CME" R="21"></Pty><Pty ID="905" R="4"></Pty><Pty ID="CME" R="22"></Pty><Pty ID="905" R="1"></Pty><Pty ID="77040322" R="24"><Sub ID="1" Typ="26"/></Pty><Pty ID="BKF" R="12"></Pty><Pty ID="826" R="17"></Pty><Pty ID="BLT" R="37"></Pty><Pty ID="905" R="38"><Sub ID="1" Typ="26"/></Pty><Pty ID="905" R="7"></Pty><RegTrdID ID="FECC15457C3D779LEB0202D1BC8" Src="1010000023" Typ="0" Evnt="2"/></RptSide></TrdCaptRpt>
<TrdCaptRpt RptID="21206412158" TrdTyp="0" TrdSubTyp="5" TrdDt="2016-04-27" BizDt="2016-04-27" MLegRptTyp="1" MtchStat="1" MsgEvtSrc="REG" TrdID="124710" LastQty="5" LastPx="0.13" SettlCcy="USD" SettlDt="2016-08-31" PxSubTyp="1" VenueTyp="P" VenuTyp="P" OfstInst="0"><Instrmt ID="DA" Desc="CLASS III MILK OPTIONS" CFI="OPAXPS" SecTyp="OOF" MMY="201608" MatDt="2016-08-31" StrkPx="13" Mult="2000" Exch="CME" UOM="lbs" UOMQty="200000" PxUOM="LBS" PxUOMQty="100" PutCall="0" ValMeth="EQTY" Fctr="1" PxQteCcy="USD"></Instrmt><Undly CFI="FCACSO" Desc="CLASS III MILK FUTURES" ID="DA" Src="H" MMY="201608" SecTyp="FUT" Exch="CME"></Undly><Amt Typ="PREM" Amt="1300" Ccy="USD" SettlDt="2016-04-27"/><RptSide Side="2" ClOrdID="726" CustCpcty="4" OrdTyp="L" SesID="EOD" SesSub="P" TmBkt="M" AllocInd="1" AgrsrInd="Y"><Pty ID="CME" R="21"></Pty><Pty ID="905" R="4"></Pty><Pty ID="CME" R="22"></Pty><Pty ID="905" R="1"></Pty><Pty ID="7704038A" R="24"><Sub ID="1" Typ="26"/></Pty><Pty ID="GRTY" R="12"></Pty><Pty ID="888" R="17"></Pty><Pty ID="GRTY" R="37"></Pty><Pty ID="905" R="38"><Sub ID="1" Typ="26"/></Pty><Pty ID="905" R="7"></Pty><RegTrdID ID="FECC1544943BFEC0302D9028AE" Src="1010000023" Typ="0" Evnt="2"/></RptSide></TrdCaptRpt>
</Batch></FIXML>

I am trying to convert it to a CSV file. I tried the following code, but could not get the correct output:

^{pr2}$

I cannot get all of the tags into the CSV. Is there a simple way to export this to CSV?

Thank you.


Tags: src, id, desc, long, pty, short, usd, qty
2 answers

You should first collect the attribute values for each row into a list.

for node in tree.iter('TrdCaptRpt'):

    .....

    my_list.append([RptID, TrdTyp, TrdSubTyp, TrdDt, BizDt,
                  MLegRptTyp, MtchStat, MsgEvtSrc, TrdID, 
                  LastQty, LastPx, TxnTm, SettlCcy, SettlDt, 
                  PxSubTyp, VenueTyp, VenuTyp, OfstInst])

Then write each row to the file:

^{pr2}$
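
The two steps of this answer (collect each row into a list, then write the rows out) can be sketched with the standard library as follows. This is only an illustration under stated assumptions: the sample XML is a shortened stand-in for the question's data, and the names `SAMPLE`, `COLUMNS`, `collect_rows` and `rows_to_csv` are hypothetical, not taken from the original answer.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Shortened, hypothetical sample in the same shape as the question's data.
SAMPLE = """<FIXML><Batch>
<TrdCaptRpt RptID="21195360680" TrdTyp="0" TrdID="106695" LastQty="1" LastPx="323.5"/>
<TrdCaptRpt RptID="21202575211" TrdTyp="0" TrdID="000991" LastQty="100" LastPx="0.31"/>
</Batch></FIXML>"""

# Columns to export; TrdSubTyp is deliberately absent from the sample elements.
COLUMNS = ["RptID", "TrdTyp", "TrdSubTyp", "TrdID", "LastQty", "LastPx"]


def collect_rows(xml_text, tag, columns):
    """Return one list per matching element; absent attributes become ''."""
    root = ET.fromstring(xml_text)
    return [[node.get(col, "") for col in columns] for node in root.iter(tag)]


def rows_to_csv(columns, rows):
    """Write a header line plus the collected rows and return the CSV text."""
    buf = io.StringIO()
    writer = csv.writer(buf)
    writer.writerow(columns)
    writer.writerows(rows)
    return buf.getvalue()


my_list = collect_rows(SAMPLE, "TrdCaptRpt", COLUMNS)
csv_text = rows_to_csv(COLUMNS, my_list)
print(csv_text)
```

Using `node.get(col, "")` keeps the columns aligned when an element lacks an attribute, which happens in the question's data (for example, not every TrdCaptRpt has TrdSubTyp or TxnTm).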

Use csv.DictWriter and take the values from the node.attrib dictionary

Elements named TrdCaptRpt carry attributes. For each such node, node.attrib holds a dictionary with a key/value pair for every attribute.

csv.DictWriter can write rows directly from dictionaries.
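
As a small illustration of how csv.DictWriter treats a dictionary, the sketch below writes one record that has a key not listed in the field names and lacks one that is listed. The dictionary stands in for a node.attrib value; the three chosen field names are just an assumption for the example.

```python
import csv
import io

fields = ["RptID", "TrdID", "TxnTm"]

# Shaped like node.attrib: LastQty is not in `fields`, and TxnTm is missing.
record = {"RptID": "21206412158", "TrdID": "124710", "LastQty": "5"}

buf = io.StringIO()
writer = csv.DictWriter(buf, fields, delimiter=";", extrasaction="ignore")
writer.writeheader()
# With extrasaction="ignore", LastQty is dropped instead of raising ValueError;
# the missing TxnTm is filled with restval, which defaults to an empty string.
writer.writerow(record)

lines = buf.getvalue().splitlines()
print(lines)
```

Without `extrasaction="ignore"` the default is `"raise"`, so any attribute that is not listed in the field names would stop the export with a ValueError.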

First, some imports (I always use lxml because it is very fast and offers extra features):

from lxml import etree
import csv

Configure the file names and the fields to use in each record:

^{pr2}$

Read the XML:

xml = etree.parse(xml_fname)

Iterate over the "TrdCaptRpt" elements and write their attribute values to the CSV file:

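# newline="" lets the csv module control line endings (avoids blank lines on Windows)
with open(csv_fname, "w", newline="") as f:
    writer = csv.DictWriter(f, fields, delimiter=";", extrasaction="ignore")
    writer.writeheader()
    for node in xml.iter("TrdCaptRpt"):
        # extrasaction="ignore" skips attributes that are not listed in fields
        writer.writerow(node.attrib)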

If you prefer the stdlib xml.etree.ElementTree, you should manage just as easily, because node.attrib is available there too.
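
A minimal sketch of the same export using only the standard library could look like this. It parses an in-memory string so that it runs on its own; the shortened `SAMPLE`, the reduced `fields` list and the helper name `trades_to_csv` are assumptions made for the example.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Shortened, hypothetical sample in the same shape as the question's data.
SAMPLE = """<FIXML><Batch>
<PosRpt RptID="34868266266" BizDt="2016-04-27"/>
<TrdCaptRpt RptID="21195360680" TrdDt="2016-04-27" TrdID="106695" LastQty="1" LastPx="323.5" SettlCcy="USD"/>
<TrdCaptRpt RptID="21202575211" TrdDt="2016-04-27" TrdID="000991" LastQty="100" LastPx="0.31" SettlCcy="USD"/>
</Batch></FIXML>"""

fields = ["RptID", "TrdDt", "TrdID", "LastQty", "LastPx"]


def trades_to_csv(xml_text, fields):
    """Return CSV text with one row per TrdCaptRpt element."""
    root = ET.fromstring(xml_text)
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fields, delimiter=";", extrasaction="ignore")
    writer.writeheader()
    for node in root.iter("TrdCaptRpt"):
        # In the stdlib, node.attrib is a plain dict of attribute names to values.
        writer.writerow(node.attrib)
    return buf.getvalue()


print(trades_to_csv(SAMPLE, fields))
```

For a file on disk, `ET.parse(xml_fname)` returns a tree whose `iter("TrdCaptRpt")` method can be used in the same way.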

Reading from multiple element names

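In your comment you noted that you want to export attributes from more element names. That is possible too. For this I modified the example to use xpath (which probably works only with lxml) and added an extra column, "elm_name", to track which element each record was created from: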

fields = [
    "elm_name",

    "RptID", "TrdTyp", "TrdSubTyp", "ExecID", "TrdDt", "BizDt", "MLegRptTyp",
    "MtchStat", "MsgEvtSrc", "TrdID", "LastQty", "LastPx", "TxnTm", "SettlCcy",
    "SettlDt", "PxSubTyp", "VenueTyp", "VenuTyp", "OfstInst",

    "Typ", "Amt", "Ccy"
]

xml = etree.parse(xml_fname)

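# newline="" lets the csv module control line endings (avoids blank lines on Windows)
with open(csv_fname, "w", newline="") as f:
    writer = csv.DictWriter(f, fields, delimiter=";", extrasaction="ignore")
    writer.writeheader()
    for node in xml.xpath("//*[self::TrdCaptRpt or self::PosRpt or self::Amt]"):
        # Copy the attributes so that adding elm_name does not modify the XML tree
        atts = dict(node.attrib)
        atts["elm_name"] = node.tag
        writer.writerow(atts)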

The changes are:

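  • fields gained the extra "elm_name" field plus the fields of the other elements (feel free to remove the ones you are not interested in).
  • Elements are iterated with xml.xpath. The XPath expression is more complex, so I am not sure whether the stdlib ElementTree supports it.
  • Before writing a record, I add the element's name to the atts dictionary so that the record shows which element it came from.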
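
If lxml is not available, one way to get a similar result with the standard library is to walk every element and filter by tag name instead of using the XPath expression. The sketch below is an alternative technique, not the answer's method; `SAMPLE`, `WANTED`, the reduced `fields` list and `elements_to_csv` are hypothetical names chosen for the example.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Shortened, hypothetical sample in the same shape as the question's data.
SAMPLE = """<FIXML><Batch>
<PosRpt RptID="34868266266" BizDt="2016-04-27"><Amt Typ="SMTM" Amt="0" Ccy="USD"/></PosRpt>
<TrdCaptRpt RptID="21195360680" BizDt="2016-04-27" TrdID="106695"><Amt Typ="TVAR" Amt="480" Ccy="USD"/></TrdCaptRpt>
</Batch></FIXML>"""

WANTED = {"TrdCaptRpt", "PosRpt", "Amt"}
fields = ["elm_name", "RptID", "BizDt", "TrdID", "Typ", "Amt", "Ccy"]


def elements_to_csv(xml_text):
    """Return CSV text with one row per element whose tag is in WANTED."""
    root = ET.fromstring(xml_text)
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fields, delimiter=";", extrasaction="ignore")
    writer.writeheader()
    for node in root.iter():           # visits all elements in document order
        if node.tag in WANTED:
            row = dict(node.attrib)    # copy, so the tree itself is not modified
            row["elm_name"] = node.tag
            writer.writerow(row)
    return buf.getvalue()


print(elements_to_csv(SAMPLE))
```

Because the rows come out in document order, each Amt row directly follows the row of the element that contains it, just as with the XPath version.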

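Warning: Amt elements are nested inside PosRpt (and TrdCaptRpt), and this tree structure cannot be represented in CSV. The records are written, but they carry no information about where they came from (apart from following the record of their parent element).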
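
One possible workaround, sketched here with the standard library, is to copy an identifier from the parent element into each Amt row so that the origin survives flattening. This goes beyond the original answer: the column names `parent_name` and `parent_RptID`, the shortened `SAMPLE` and the helper `amounts_with_parent` are all hypothetical, and the sketch assumes that Amt elements are direct children of PosRpt or TrdCaptRpt, as they are in the question's data.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Shortened, hypothetical sample in the same shape as the question's data.
SAMPLE = """<FIXML><Batch>
<PosRpt RptID="34868266266" BizDt="2016-04-27"><Amt Typ="SMTM" Amt="0" Ccy="USD"/></PosRpt>
<TrdCaptRpt RptID="21195360680" BizDt="2016-04-27" TrdID="106695"><Amt Typ="TVAR" Amt="480" Ccy="USD"/></TrdCaptRpt>
</Batch></FIXML>"""

PARENTS = ("TrdCaptRpt", "PosRpt")
fields = ["parent_name", "parent_RptID", "Typ", "Amt", "Ccy"]


def amounts_with_parent(xml_text):
    """Return CSV text with one row per Amt, tagged with its parent's RptID."""
    root = ET.fromstring(xml_text)
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fields, delimiter=";", extrasaction="ignore")
    writer.writeheader()
    for parent in root.iter():
        if parent.tag not in PARENTS:
            continue
        for amt in parent.findall("Amt"):    # direct children only
            row = dict(amt.attrib)
            row["parent_name"] = parent.tag
            row["parent_RptID"] = parent.get("RptID", "")
            writer.writerow(row)
    return buf.getvalue()


print(amounts_with_parent(SAMPLE))
```

Each Amt row can then be joined back to its report by `parent_RptID`, at the cost of repeating that value on every row.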
