擅长:python、mysql、java
<p>我会这样做:使用<a href="https://docs.python.org/3.3/library/collections.html#collections.defaultdict" rel="nofollow noreferrer">^{<cd1>}</a>来收集每个<code>id_user</code>值的节点。然后,对生成的字典进行后期处理,将副本写入单独的文件。使用<code>lxml.etree</code>:</p>
<pre><code>from collections import defaultdict
from lxml import etree
tree = etree.parse("input.xml")
facturics = defaultdict(list)
for node in tree.xpath(".//facturic"):
facturics[node.attrib["id_user"]].append(node)
for user_id, nodes in facturics.items():
if len(nodes) > 1: # save duplicates
with open("{user_id}.xml".format(user_id=user_id), "w") as output_file:
root = etree.Element("ROOT")
for node in nodes:
root.append(node)
etree.ElementTree(root).write(output_file, pretty_print=True)
</code></pre>
<p>运行此代码后,将在当前目录中生成一个名为<code>18446195.xml</code>的新文件,其中包含以下内容:</p>
^{pr2}$