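<p>After reading through a few links, I arrived at a solution based on iterative parsing. However, I have not been able to determine how RAM usage differs between a plain parse and iterparse.</p>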
<p>Important links:<br/>
-<a href="http://www.ibm.com/developerworks/xml/library/x-hiperfparse/" rel="nofollow noreferrer">http://www.ibm.com/developerworks/xml/library/x-hiperfparse/</a><br/>
-<a href="https://stackoverflow.com/questions/9856163/using-lxml-and-iterparse-to-parse-a-big-1gb-xml-file">using lxml and iterparse() to parse a big (+- 1Gb) XML file</a></p>
<p>Code:</p>
<pre><code>import lxml.etree as et

# Clark-notation tag names and ElementPath expressions for GraphML.
graphml = {
    "graph": "{http://graphml.graphdrawing.org/xmlns}graph",
    "node": "{http://graphml.graphdrawing.org/xmlns}node",
    "edge": "{http://graphml.graphdrawing.org/xmlns}edge",
    "data": "{http://graphml.graphdrawing.org/xmlns}data",
    "weight": "{http://graphml.graphdrawing.org/xmlns}data[@key='weight']",
    "edgeid": "{http://graphml.graphdrawing.org/xmlns}data[@key='edgeid']"
}

# Stream only the "end" events of edge elements.
for event, elem in et.iterparse("/data/sample.graphml", tag=graphml.get("edge"), events=("end",)):
    print(et.tostring(elem))
    # Empty the processed element, then delete the siblings that precede it
    # so the partially built tree does not keep growing.
    elem.clear()
    while elem.getprevious() is not None:
        del elem.getparent()[0]
</code></pre>
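<p>On the open question of RAM usage, here is a rough sketch of one way to compare the two approaches. It is not the code above: it swaps in the standard library's <code>xml.etree.ElementTree</code> and <code>tracemalloc</code>, because <code>tracemalloc</code> only sees memory obtained through Python's allocators and, as far as I know, lxml keeps its tree in libxml2 memory that it cannot observe (for lxml you would probably watch the process RSS instead). The helper names <code>make_graphml</code>, <code>peak_bytes</code>, <code>count_edges_full</code> and <code>count_edges_streaming</code> and the synthetic data are made up for illustration. Since the standard library has no <code>getparent()</code>, the sketch tracks the open elements on a stack.</p>

```python
import io
import tracemalloc
import xml.etree.ElementTree as ET

NS = "http://graphml.graphdrawing.org/xmlns"
EDGE = f"{{{NS}}}edge"


def make_graphml(n_edges):
    """Build a synthetic GraphML document in memory (hypothetical test data)."""
    parts = [f'<graphml xmlns="{NS}"><graph id="G" edgedefault="directed">']
    for i in range(n_edges):
        parts.append(
            f'<edge source="n{i}" target="n{i + 1}">'
            f'<data key="weight">{i}</data></edge>'
        )
    parts.append("</graph></graphml>")
    return "".join(parts).encode("utf-8")


def count_edges_full(source):
    """Plain parse: the whole tree is built before anything is counted."""
    tree = ET.parse(source)
    return sum(1 for _ in tree.iter(EDGE))


def count_edges_streaming(source):
    """iterparse: detach each edge from its parent once it has been handled."""
    count = 0
    stack = []  # elements that are currently open, innermost last
    for event, elem in ET.iterparse(source, events=("start", "end")):
        if event == "start":
            stack.append(elem)
            continue
        stack.pop()  # elem itself has just been closed
        if elem.tag == EDGE:
            count += 1
            if stack:
                stack[-1].remove(elem)  # let the finished edge be freed
    return count


def peak_bytes(func, data):
    """Run func on a fresh stream; return (result, peak traced bytes)."""
    tracemalloc.start()
    try:
        result = func(io.BytesIO(data))
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return result, peak


if __name__ == "__main__":
    data = make_graphml(20000)
    for func in (count_edges_full, count_edges_streaming):
        edges, peak = peak_bytes(func, data)
        print(f"{func.__name__}: {edges} edges, peak {peak / 1024:.0f} KiB")
```

<p>The gap should widen as the file grows, since the plain parse holds every element at once while the streaming loop only holds what the parser has buffered. Without the <code>remove</code> call the streaming version would still build the full tree, which may be why no difference shows up if the cleanup step is skipped or never reached.</p>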