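<p>Here is an approach that avoids the problems with xml.dom and also handles the ambiguous case in which a node has both content and attributes or child nodes. For the sample input above, it produces:</p>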
<pre><code>$ python3 yamlout.py person.xml
---
person:
  firstName: John
  lastName: Smith
  age: 25
  address:
    streetAddress: 21 2nd Street
    city: New York
    state: NY
    postalCode: 10021
  phoneNumbers:
    phoneNumber:
      _xml_node_content: 212 555-1234
      type: home # Attribute
    phoneNumber:
      _xml_node_content: 646 555-4567
      type: fax # Attribute
  gender:
    type: male
</code></pre>
<p>The implementation, yamlout.py:</p>
<pre><code>import sys
import xml.etree.ElementTree as ET

XML_NODE_CONTENT = '_xml_node_content'
ATTR_COMMENT = '# Attribute'
INDENT = '  '  # two spaces per nesting level


def yamlout(node, depth=0):
    if not depth:
        sys.stdout.write('---\n')
    # Nodes with both content AND nested nodes or attributes
    # have no valid yaml mapping. Add a 'content' entry for that case.
    nodeattrs = dict(node.attrib)  # copy so the parsed tree is not modified
    children = list(node)
    content = node.text.strip() if node.text else ''
    if content:
        if not (nodeattrs or children):
            # Write as just a name: value pair, nothing else nested
            sys.stdout.write('{indent}{tag}: {text}\n'.format(
                indent=depth * INDENT, tag=node.tag, text=content))
            return
        # Store the stripped text, listed ahead of the real attributes
        nodeattrs = {XML_NODE_CONTENT: content, **nodeattrs}
    sys.stdout.write('{indent}{tag}:\n'.format(
        indent=depth * INDENT, tag=node.tag))
    depth += 1
    # Write attributes, marked with a comment to tell them apart
    # from nested nodes
    for n, v in nodeattrs.items():
        comment = ' ' + ATTR_COMMENT if n != XML_NODE_CONTENT else ''
        sys.stdout.write('{indent}{n}: {v}{c}\n'.format(
            indent=depth * INDENT, n=n, v=v, c=comment))
    # Write nested nodes
    for child in children:
        yamlout(child, depth)


if __name__ == '__main__':
    if len(sys.argv) != 2:
        sys.stderr.write("Usage: {0} &lt;file&gt;.xml\n".format(sys.argv[0]))
        sys.exit(1)
    tree = ET.parse(sys.argv[1])
    yamlout(tree.getroot())
</code></pre>
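<p>To see the mixed-content rule in isolation, here is a minimal sketch that applies the same mapping to an in-memory string and returns the lines instead of printing them. The helper name <code>node_to_lines</code> and the <code>SAMPLE</code> input are assumptions made for illustration; they are not part of yamlout.py or of the original person.xml.</p>

```python
import xml.etree.ElementTree as ET

XML_NODE_CONTENT = '_xml_node_content'  # same key name as in yamlout.py


def node_to_lines(node, depth=0):
    """Hypothetical helper: return the YAML lines for one element."""
    pad = '  ' * depth
    attrs = dict(node.attrib)
    children = list(node)
    content = (node.text or '').strip()
    # Plain leaf: text only, no attributes and no children
    if content and not (attrs or children):
        return ['{0}{1}: {2}'.format(pad, node.tag, content)]
    lines = ['{0}{1}:'.format(pad, node.tag)]
    # Mixed case: the text moves under a reserved key
    if content:
        lines.append('{0}  {1}: {2}'.format(pad, XML_NODE_CONTENT, content))
    for name, value in attrs.items():
        lines.append('{0}  {1}: {2} # Attribute'.format(pad, name, value))
    for child in children:
        lines.extend(node_to_lines(child, depth + 1))
    return lines


# Illustrative input only, not the person.xml from the question
SAMPLE = ('<phoneNumbers>'
          '<phoneNumber type="home">212 555-1234</phoneNumber>'
          '<note>call late</note>'
          '</phoneNumbers>')

if __name__ == '__main__':
    print('\n'.join(node_to_lines(ET.fromstring(SAMPLE))))
```

<p>The <code>phoneNumber</code> element has both text and an attribute, so its text is emitted under <code>_xml_node_content</code>, while <code>note</code> has text only and stays a plain <code>name: value</code> pair.</p>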