Reading XML with Python minidom and iterating over each node

<root>
  <conference name='1'>
    <author> Bob </author>
    <author> Nigel </author>
  </conference>
  <conference name='2'>
    <author> Alice </author>
    <author> Mary </author>
  </conference>
</root>

from xml.dom.minidom import parse

dom = parse(filepath)
conference = dom.getElementsByTagName('conference')
for node in conference:
    conf_name = node.getAttribute('name')
    print(conf_name)
    alist = node.getElementsByTagName('author')
    for a in alist:
        authortext = a.nodeValue  # prints None instead of the author name
        print(authortext)

3 Answers

User

#1 · Edited 2024-05-20 21:00:39

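Element nodes have no nodeValue; on an element it is always None. You have to look at the text nodes inside the element. If you know there is always exactly one text node inside, you can use element.firstChild.data (for a text node, data is the same as nodeValue).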

Note: if the element has no text content, it has no child text node, so element.firstChild is None and accessing .data raises an AttributeError.
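To illustrate the caveat above, here is a minimal sketch of a guarded lookup. The helper name first_text and the sample XML string are assumptions made for this example; they are not part of the original answer or of minidom itself.

```python
from xml.dom.minidom import parseString


def first_text(element, default=""):
    """Return the data of the first child if it is a text node, else default.

    Hypothetical helper for illustration only.
    """
    child = element.firstChild  # None when the element has no children
    if child is not None and child.nodeType == child.TEXT_NODE:
        return child.data
    return default


# Sample document (assumed): the second <author/> is empty, so firstChild is None.
doc = parseString("<root><author> Bob </author><author/></root>")
authors = doc.getElementsByTagName("author")
names = [first_text(a).strip() for a in authors]
print(names)
```

The empty element falls back to the default instead of failing on .data.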

A quick way to get the content of the direct child text nodes:

text = ''.join(child.data for child in element.childNodes if child.nodeType == child.TEXT_NODE)
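Applied to the XML from the question, that one-liner could be used roughly as follows. This is a sketch: the direct_text helper, the XML constant, and the result dictionary are names chosen for this example, and the document is parsed from a string rather than from filepath so the snippet is self-contained.

```python
from xml.dom.minidom import parseString

# The question's document, inlined as a string (assumption for self-containment).
XML = """<root>
  <conference name='1'><author> Bob </author><author> Nigel </author></conference>
  <conference name='2'><author> Alice </author><author> Mary </author></conference>
</root>"""


def direct_text(element):
    """Concatenate the data of the element's direct child text nodes."""
    return "".join(
        child.data
        for child in element.childNodes
        if child.nodeType == child.TEXT_NODE
    )


dom = parseString(XML)
result = {}
for conf in dom.getElementsByTagName("conference"):
    authors = conf.getElementsByTagName("author")
    # strip() removes the padding spaces around each name in the source XML
    result[conf.getAttribute("name")] = [direct_text(a).strip() for a in authors]

for name, people in result.items():
    print(name, people)
```

Unlike firstChild.data, this also works for empty elements, where it simply yields an empty string.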

DOM Level 3 Core adds the textContent property, which recursively collects the text inside an element, but minidom does not support it (some other Python DOM implementations do).
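Since minidom lacks textContent, a rough stand-in can be written by hand. The function below is a sketch under the assumption that only text and CDATA nodes should contribute; the name text_content and the sample markup are invented for this example, and the function is not claimed to match the DOM specification in every corner case.

```python
from xml.dom.minidom import parseString


def text_content(node):
    """Recursively gather text below node, approximating DOM Level 3 textContent."""
    if node.nodeType in (node.TEXT_NODE, node.CDATA_SECTION_NODE):
        return node.data
    # Comments and processing instructions have no children, so they add nothing.
    return "".join(text_content(child) for child in node.childNodes)


# Sample markup (assumed) with nested elements and a comment.
doc = parseString("<author>Bob <b>the</b><!-- note --> Builder</author>")
full_text = text_content(doc.documentElement)
print(full_text)
```

The direct-child approach would skip the text inside the nested b element, whereas the recursive version includes it.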

User

#2 · Edited 2024-05-20 21:00:39

The node a that you read authortext from is of type 1 (ELEMENT_NODE); to get a string you normally need a TEXT_NODE. This will work:

a.childNodes[0].nodeValue
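The difference between the two node types can be checked directly. This is a small sketch using a one-element document invented for this example; the variable names are illustrative only.

```python
from xml.dom.minidom import parseString

# Single-element sample document (assumed for illustration).
doc = parseString("<author> Bob </author>")
a = doc.documentElement

element_type = a.nodeType    # element node
element_value = a.nodeValue  # elements carry no value of their own

text_node = a.childNodes[0]
text_type = text_node.nodeType    # text node
text_value = text_node.nodeValue  # the string, including surrounding spaces

print(element_type, element_value, text_type, repr(text_value))
```

Note that childNodes[0] raises IndexError for an empty element, so this form is only safe when the text is known to be present.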

User

#3 · Edited 2024-05-20 21:00:39

Quick access to the text of the first author:

node.getElementsByTagName('author')[0].childNodes[0].nodeValue
