Handling multiple nodes when parsing XML with Python

<pub>
  <ID>7</ID>
  <title>On the Correlation of Image Size to System Accuracy in Automatic Fingerprint Identification Systems</title>
  <year>2003</year>
  <booktitle>AVBPA</booktitle>
  <pages>895-902</pages>
  <authors>
    <author>J. K. Schneider</author>
    <author>C. E. Richardson</author>
    <author>F. W. Kiefer</author>
    <author>Venu Govindaraju</author>
  </authors>
</pub>

from xml.dom import minidom

xmldoc = minidom.parse("test.xml")
pub = xmldoc.getElementsByTagName("pub")[0]
ID = pub.getElementsByTagName("ID")[0].firstChild.data
title = pub.getElementsByTagName("title")[0].firstChild.data
year = pub.getElementsByTagName("year")[0].firstChild.data
booktitle = pub.getElementsByTagName("booktitle")[0].firstChild.data
pages = pub.getElementsByTagName("pages")[0].firstChild.data
authors = pub.getElementsByTagName("authors")[0]
author = authors.getElementsByTagName("author")[0].firstChild.data
num_authors = len(author)
print("Number of authors: ", num_authors )
print(ID)
print(title)
print(year)
print(booktitle)
print(pages)
print(author)

1 Answer

User

#1 · Posted 2024-06-28 15:10:23

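Note that what you get here is the number of characters in the first author's name, because the code restricts the result to the first author only (index 0) and then takes the length of that string: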

author = authors.getElementsByTagName("author")[0].firstChild.data
num_authors = len(author)
print("Number of authors: ", num_authors )
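To see the effect in isolation, here is a minimal sketch that parses a trimmed copy of the record from a string instead of test.xml. The names xml_text and first_author are illustrative and do not come from the question:

```python
from xml.dom import minidom

# Trimmed copy of the question's record, parsed from a string so no test.xml is needed
xml_text = (
    "<pub><authors>"
    "<author>J. K. Schneider</author>"
    "<author>C. E. Richardson</author>"
    "</authors></pub>"
)
doc = minidom.parseString(xml_text)
authors = doc.getElementsByTagName("authors")[0]

# Indexing with [0] selects a single element; .firstChild.data is its text
first_author = authors.getElementsByTagName("author")[0].firstChild.data
print(first_author)       # the text of the first <author> only
print(len(first_author))  # length of that string, not the number of authors
```

Because first_author is a plain string, len() counts its characters rather than the author elements.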

Simply don't restrict the result to the first element, so that you get all the authors:

author = authors.getElementsByTagName("author")
num_authors = len(author)
print("Number of authors: ", num_authors )

You can use a list comprehension to get a list of all the author names rather than the author elements:

author = [a.firstChild.data for a in authors.getElementsByTagName("author")]
print(author)
# ['J. K. Schneider', 'C. E. Richardson', 'F. W. Kiefer', 'Venu Govindaraju']
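Putting the pieces together, the following is one possible self-contained sketch, not the only way to do it. It assumes Python 3 and parses a shortened copy of the record from a string. The helper child_text and the names XML_TEXT, author_nodes and author_names are hypothetical, introduced only for this example:

```python
from xml.dom import minidom

# Shortened copy of the question's record (some fields omitted for brevity)
XML_TEXT = """<pub>
  <ID>7</ID>
  <year>2003</year>
  <authors>
    <author>J. K. Schneider</author>
    <author>C. E. Richardson</author>
    <author>F. W. Kiefer</author>
    <author>Venu Govindaraju</author>
  </authors>
</pub>"""


def child_text(parent, tag):
    """Hypothetical helper: text of the first <tag> element under parent."""
    return parent.getElementsByTagName(tag)[0].firstChild.data


pub = minidom.parseString(XML_TEXT).getElementsByTagName("pub")[0]

# Keep the whole NodeList instead of indexing with [0]
author_nodes = pub.getElementsByTagName("author")
num_authors = len(author_nodes)

# Extract the text of every <author> element
author_names = [node.firstChild.data for node in author_nodes]

print("Number of authors:", num_authors)
print(child_text(pub, "ID"), child_text(pub, "year"))
print(author_names)
```

Keeping the NodeList and the list of names in separate variables avoids reusing one name (author) for both a single string and a collection, which is what hid the problem in the original code.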
