Dumping data from <*></*> tags in XML to CSV in Python (many different XML stylesheet formats)

<?xml version="1.0" encoding="utf-8"?>
<?xml-stylesheet type='text/xsl' href='ANZMeta.xsl'?>
<anzmeta>
  <citeinfo>
    <uniqueid />
    <title>&lt;&gt;</title>
    <origin>
      <custod>ATGIS</custod>
      <jurisdic>
        <keyword thesaurus="">Tablelands Regional Council</keyword>
      </jurisdic>
    </origin>
  </citeinfo>
  <descript>
    <abstract>&lt;&gt; </abstract>
    <theme>
      <keyword thesaurus="">EPSG</keyword>
    </theme>
    <spdom>
      <keyword thesaurus="">GDA94</keyword>
      <keyword thesaurus="">GRS80</keyword>
      <keyword thesaurus="">Map Grid of Australia</keyword>
      <keyword thesaurus="">Zone 55 (144E - 150E)</keyword>
      <bounding>
        <northbc />
        <southbc />
        <eastbc />
        <westbc />
      </bounding>
    </spdom>
  </descript>
  <timeperd>
    <begdate>
      <date>2012</date>
    </begdate>
    <enddate>
      <keyword thesaurus="">Completed</keyword>
    </enddate>
  </timeperd>
  <status>
    <progress>
      <keyword thesaurus="">Ongoing</keyword>
      <keyword thesaurus="">Completed</keyword>
    </progress>
    <update>
      <keyword thesaurus="">As Required</keyword>
      <keyword thesaurus="">As Required</keyword>
    </update>
  </status>
  <distinfo>
    <native>
      <nondig>
        <formname>File</formname>
      </nondig>
      <digform>
        <formname>Type:</formname>
      </digform>
    </native>
    <avlform>
      <nondig>
        <formname>Format:</formname>
      </nondig>
      <digform>
        <formname>Size</formname>
      </digform>
    </avlform>
    <accconst>Internal Use Only</accconst>
  </distinfo>
  <dataqual>
    <lineage>~TBC~</lineage>
    <procstep>
      <procdesc Sync="TUE">Metadata imported.</procdesc>
      <srcused Sync="TRUE">L:\Data_Admin\MetadataGenerator\trc_Metadata_Template.xml</srcused>
      <date Sync="TRUE">20121206</date>
      <time Sync="TRUE">15341400</time>
    </procstep>
    <posacc>~TBC~</posacc>
    <attracc>~TBC~</attracc>
    <logic>~TBC~</logic>
    <complete>~TBC~</complete>
  </dataqual>
  <cntinfo>
    <cntorg>Atherton Tablelands GIS</cntorg>
    <cntpos>GIS Coordinator</cntpos>
    <address>PO Box 1616, 8 Tolga Rd</address>
    <city>Atherton</city>
    <state>QLD</state>
    <country>AUSTRALIA</country>
    <postal>4883</postal>
    <cntvoice>07 40918600</cntvoice>
    <cntfax>07 40917035</cntfax>
    <cntemail>info@atgis.com.au</cntemail>
  </cntinfo>
  <metainfo>
    <metd>
      <date />
    </metd>
  </metainfo>
</anzmeta>

import os, xml, shutil, datetime
from xml.etree import ElementTree as et

SourceDIR = os.getcwd()
outDIR = os.getcwd() + '//out'


def locatexml(SourceDIR, outDIR):
    # Walk the source tree and collect every file with an .xml extension
    xmllist = []
    for root, dirs, files in os.walk(SourceDIR, topdown=False):
        for fl in files:
            currentFile = os.path.join(root, fl)
            ext = fl[fl.rfind('.') + 1:]
            if ext == 'xml':
                xmllist.append(currentFile)
                print(currentFile)
                readxml(currentFile)
    print("finished")
    return xmllist


def readxml(currentFile):
    # Parse the file; the tree is not used for anything yet
    tree = et.parse(currentFile)
    print("Processing: " + str(currentFile))


# Keep the returned list; xmllist is local to locatexml()
xmllist = locatexml(SourceDIR, outDIR)
print(xmllist)

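2 Answers

User

#1 · Edited 2024-10-02 00:28:49

<anzmeta> is the root of the document, so you should try to find one of its direct children (such as citeinfo) rather than the root tag name itself.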
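To illustrate the point, here is a minimal sketch using the standard library's `xml.etree.ElementTree`. The shortened inline document is a made-up stand-in for the metadata file, not the asker's full input.

```python
from xml.etree import ElementTree as et

# Shortened stand-in for the metadata file (illustrative only).
SAMPLE = """<anzmeta>
  <citeinfo>
    <origin><custod>ATGIS</custod></origin>
  </citeinfo>
  <cntinfo><city>Atherton</city></cntinfo>
</anzmeta>"""

root = et.fromstring(SAMPLE)

# find() searches below the element it is called on, so asking the
# root for its own tag name finds nothing.
missing = root.find('anzmeta')

# A direct child of the root is found by its tag name.
citeinfo = root.find('citeinfo')

# Deeper elements can be reached with a path relative to the root.
custod = root.find('citeinfo/origin/custod')

print(missing)       # None
print(citeinfo.tag)  # citeinfo
print(custod.text)   # ATGIS
```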

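User

#2 · Edited 2024-10-02 00:28:49

Really, you should use XSLT for this job, since it is a transformation from XML to another format. See the answers to this question for an example.

That said, if you want to do it with other code: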

from lxml import etree

# Open in binary mode so lxml can honor the encoding declared in the file
with open('test.xml', 'rb') as f:
    tree = etree.parse(f)

# At this point, we can step through the xml file
# and parse it, here is an example of the `cntinfo` tag

for element in tree.iter('cntinfo'):
    # Iterating an element yields its direct children
    for child in element:
        print("{0.tag}: {0.text}".format(child))

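This prints:

cntorg: Atherton Tablelands GIS
cntpos: GIS Coordinator
address: PO Box 1616, 8 Tolga Rd
city: Atherton
state: QLD
country: AUSTRALIA
postal: 4883
cntvoice: 07 40918600
cntfax: 07 40917035
cntemail: info@atgis.com.au

You can step through the other elements in the file the same way, but I strongly recommend using XSLT.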
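If you do go the non-XSLT route, stepping through every element can be turned into CSV rows. The sketch below is one possible approach, not the method from this answer: it uses the standard library instead of lxml, and the helper names `leaf_rows` and `xml_to_csv`, the `path,text` column layout, and the shortened sample document are all assumptions made for illustration.

```python
import csv
import io
from xml.etree import ElementTree as et

# Shortened stand-in for the metadata file (illustrative only).
SAMPLE = """<anzmeta>
  <citeinfo>
    <origin><custod>ATGIS</custod></origin>
  </citeinfo>
  <cntinfo>
    <city>Atherton</city>
    <state>QLD</state>
  </cntinfo>
</anzmeta>"""


def leaf_rows(element, path=''):
    """Yield (path, text) for every element that has no child elements."""
    current = path + '/' + element.tag if path else element.tag
    children = list(element)
    if not children:
        yield current, (element.text or '').strip()
    for child in children:
        yield from leaf_rows(child, current)


def xml_to_csv(xml_text):
    """Return CSV text with one 'path,text' line per leaf element."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator='\n')
    writer.writerow(['path', 'text'])
    writer.writerows(leaf_rows(et.fromstring(xml_text)))
    return buffer.getvalue()


print(xml_to_csv(SAMPLE))
```

Because the rows are keyed by element path rather than by a fixed column list, the same function can be pointed at files whose layouts differ, which is the situation the question title describes.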

This snippet uses an XSLT stylesheet (from this question) to convert the XML document to CSV:

# First, we load the stylesheet
with open(r'd:\test.xsl') as f:
    temp = etree.parse(f)
    style_sheet = etree.XSLT(temp)

# Apply it to the previously parsed document tree:
converted_xml = style_sheet(tree)

# Show the result as a string (echoed when run in an interactive session):
str(converted_xml)

This gives you:

'"",    "<>",    "ATGISTablelands Regional Council"\r"<>",    "EPSG",
  "GDA94GRS80Map Grid of AustraliaZone 55 (144E - 150E)"\r"2012",    "Completed"
\r"OngoingCompleted",    "As RequiredAs Required"\r"FileType:",    "Format:Size"
,    "Internal Use Only"\r"~TBC~",    "Metadata imported.L:\\Data_Admin\\Metadat
aGenerator\\trc_Metadata_Template.xml2012120615341400",    "~TBC~",    "~TBC~",
   "~TBC~",    "~TBC~"\r"Atherton Tablelands GIS",    "GIS Coordinator",    "PO
Box 1616, 8 Tolga Rd",    "Atherton",    "QLD",    "AUSTRALIA",    "4883",    "0
7 40918600",    "07 40917035",    "info@atgis.com.au"\r""\r'
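The string above separates records with `\r` and pads each comma with spaces. If you need it as Python rows (for example, to re-write it with consistent line endings), the standard `csv` module can read it. This is a minimal sketch on a shortened, hand-written copy of that kind of string; the variable names are illustrative.

```python
import csv

# Shortened copy of the kind of string shown above (illustrative only).
converted = '"2012",    "Completed"\r"Atherton",    "QLD",    "4883"\r""\r'

# splitlines() breaks on '\r'; skipinitialspace drops the padding after
# each comma so the opening quote of the next field is recognised.
rows = list(csv.reader(converted.splitlines(), skipinitialspace=True))

for row in rows:
    print(row)
```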
