Dump data from <*></*> tags in XML to CSV in Python (multiple different XML stylesheet formats)



Note: even after learning some XSLT I did not use it, because the metadata/XSL formats vary, so a single stylesheet-based approach will not work.

For the past few hours I have been trying to take an XML file and dump the data from each tag into a CSV, but nothing has worked. I have tried ElementTree, parse and regex, all based on other questions and answers on this forum.

For <a href="https://stackoverflow.com/questions/1412004/reading-xml-using-python-minidom-and-iterating-over-each-node">example</a>, this works fine on his test data but not on my XML (sample at the end of the question):

<pre><code>>>> tree = ET.parse("test2.xml")
>>> doc = tree.getroot()
>>> thingy = doc.find('custod')
>>> print thingy.attrib
</code></pre>

This fails with: AttributeError: 'NoneType' object has no attribute 'attrib'

[code block missing]

This also fails with: AttributeError: 'NoneType' object has no attribute 'attrib'

<pre><code>>>> doc.attrib
{}
</code></pre>

Attempt using <a href="https://stackoverflow.com/questions/7911504/python-string-operation-extract-text-between-html-tags">REX</a>:

<pre><code>>>> rex = re.compile(r'<custod.*?>(.*?)</custod>',re.S|re.M)
>>> rex
<_sre.SRE_Pattern object at 0x080724A0>
>>> match=rex.match('test2.xml')
>>> match
>>> text = match.groups()[0].strip()
</code></pre>

This fails with: AttributeError: 'NoneType' object has no attribute 'groups'

<hr/>

All I need is for the script to go through my XML files and create a CSV in which each tag's full entry goes into its own column. If a column does not yet exist in the CSV, it should be added.

XML sample:

<pre><code><?xml version="1.0" encoding="utf-8"?>
<?xml-stylesheet type='text/xsl' href='ANZMeta.xsl'?>
<anzmeta>
  <citeinfo>
    <uniqueid />
    <title>&lt;&gt;</title>
    <origin>
      <custod>ATGIS</custod>
      <jurisdic>
        <keyword thesaurus="">Tablelands Regional Council</keyword>
      </jurisdic>
    </origin>
  </citeinfo>
  <descript>
    <abstract>&lt;&gt; </abstract>
    <theme>
      <keyword thesaurus="">EPSG</keyword>
    </theme>
    <spdom>
      <keyword thesaurus="">GDA94</keyword>
      <keyword thesaurus="">GRS80</keyword>
      <keyword thesaurus="">Map Grid of Australia</keyword>
      <keyword thesaurus="">Zone 55 (144E - 150E)</keyword>
      <bounding>
        <northbc />
        <southbc />
        <eastbc />
        <westbc />
      </bounding>
    </spdom>
  </descript>
  <timeperd>
    <begdate>
      <date>2012</date>
    </begdate>
    <enddate>
      <keyword thesaurus="">Completed</keyword>
    </enddate>
  </timeperd>
  <status>
    <progress>
      <keyword thesaurus="">Ongoing</keyword>
      <keyword thesaurus="">Completed</keyword>
    </progress>
    <update>
      <keyword thesaurus="">As Required</keyword>
      <keyword thesaurus="">As Required</keyword>
    </update>
  </status>
  <distinfo>
    <native>
      <nondig>
        <formname>File</formname>
      </nondig>
      <digform>
        <formname>Type:</formname>
      </digform>
    </native>
    <avlform>
      <nondig>
        <formname>Format:</formname>
      </nondig>
      <digform>
        <formname>Size</formname>
      </digform>
    </avlform>
    <accconst>Internal Use Only</accconst>
  </distinfo>
  <dataqual>
    <lineage>~TBC~</lineage>
    <procstep>
      <procdesc Sync="TUE">Metadata imported.</procdesc>
      <srcused Sync="TRUE">L:\Data_Admin\MetadataGenerator\trc_Metadata_Template.xml</srcused>
      <date Sync="TRUE">20121206</date>
      <time Sync="TRUE">15341400</time>
    </procstep>
    <posacc>~TBC~</posacc>
    <attracc>~TBC~</attracc>
    <logic>~TBC~</logic>
    <complete>~TBC~</complete>
  </dataqual>
  <cntinfo>
    <cntorg>Atherton Tablelands GIS</cntorg>
    <cntpos>GIS Coordinator</cntpos>
    <address>PO Box 1616, 8 Tolga Rd</address>
    <city>Atherton</city>
    <state>QLD</state>
    <country>AUSTRALIA</country>
    <postal>4883</postal>
    <cntvoice>07 40918600</cntvoice>
    <cntfax>07 40917035</cntfax>
    <cntemail>info@atgis.com.au</cntemail>
  </cntinfo>
  <metainfo>
    <metd>
      <date />
    </metd>
  </metainfo>
</anzmeta>
</code></pre>

The beginnings of my script:

<pre><code>import os, xml, shutil, datetime
from xml.etree import ElementTree as et

SourceDIR=os.getcwd()
outDIR=os.getcwd()+'//out'

def locatexml(SourceDIR,outDIR):
    # Walk SourceDIR and parse every file whose extension is "xml".
    xmllist=[]
    for root, dirs, files in os.walk(SourceDIR, topdown=False):
        for fl in files:
            currentFile=os.path.join(root, fl)
            ext=fl[fl.rfind('.')+1:]
            if ext=='xml':
                xmllist.append(currentFile)
                print currentFile
                readxml(currentFile)
    print "finished"
    return xmllist

def readxml(currentFile):
    # Only parses the file so far; nothing is extracted yet.
    tree=et.parse(currentFile)
    print "Processing: "+str(currentFile)

# xmllist is local to locatexml(), so keep the returned list before printing it.
xmllist=locatexml(SourceDIR,outDIR)
print xmllist
</code></pre>
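One possible direction, offered as a rough sketch rather than a definitive answer. Two things in the attempts above look like the likely causes of the `NoneType` errors: `find('custod')` only searches the direct children of the root, while `<custod>` sits under `citeinfo/origin`, and `rex.match('test2.xml')` runs the pattern against the literal string `'test2.xml'` instead of the file's contents. The sketch below walks every element, uses its path as the column name, and adds columns as new paths turn up across files. The names `flatten`, `rows_to_csv`, `SAMPLE_A` and `SAMPLE_B` are made up for illustration, joining repeated tags with `"; "` is an assumption about the desired output, and the code uses Python 3 syntax rather than the Python 2 `print` statements in the question.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Two tiny documents with different layouts, cut down from the sample above.
SAMPLE_A = (
    "<anzmeta><citeinfo><origin><custod>ATGIS</custod></origin></citeinfo>"
    "<descript><spdom>"
    '<keyword thesaurus="">GDA94</keyword>'
    '<keyword thesaurus="">GRS80</keyword>'
    "</spdom></descript></anzmeta>"
)
SAMPLE_B = "<anzmeta><cntinfo><city>Atherton</city></cntinfo></anzmeta>"


def flatten(root):
    """Map each element path (e.g. 'citeinfo/origin/custod') to its text.

    Elements without text are skipped; repeated paths are joined with '; '.
    """
    row = {}

    def walk(elem, prefix):
        path = prefix + "/" + elem.tag if prefix else elem.tag
        text = (elem.text or "").strip()
        if text:
            if path in row:
                row[path] = row[path] + "; " + text
            else:
                row[path] = text
        for child in elem:
            walk(child, path)

    # Start below the root so the root tag is not part of every column name.
    for child in root:
        walk(child, "")
    return row


def rows_to_csv(rows, out):
    """Write one CSV line per row dict; columns are the union of all keys."""
    columns = []
    for row in rows:
        for key in row:
            if key not in columns:
                columns.append(key)
    writer = csv.DictWriter(out, fieldnames=columns, restval="")
    writer.writeheader()
    writer.writerows(rows)
    return columns


root = ET.fromstring(SAMPLE_A)
print(root.find("custod"))          # None: not a direct child of the root
print(root.find(".//custod").text)  # searching all descendants finds it

buffer = io.StringIO()
rows_to_csv([flatten(ET.fromstring(s)) for s in (SAMPLE_A, SAMPLE_B)], buffer)
print(buffer.getvalue())
```

For real files, `ET.parse(path).getroot()` would replace `ET.fromstring(...)`, and `out` would be a file opened with `newline=""`. Collecting all rows before writing is what allows a column to be added when a later file introduces a tag that earlier files did not have.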
