Python: ElementTree, getting the namespace string of an Element

Published 2024-09-24 02:21:16


This XML file is named example.xml:

<?xml version="1.0"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">

  <modelVersion>14.0.0</modelVersion>
  <groupId>.com.foobar.flubber</groupId>
  <artifactId>uberportalconf</artifactId>
  <version>13-SNAPSHOT</version>
  <packaging>pom</packaging>
  <name>Environment for UberPortalConf</name>
  <description>This is the description</description>    
  <properties>
      <birduberportal.version>11</birduberportal.version>
      <promotiondevice.version>9</promotiondevice.version>
      <foobarportal.version>6</foobarportal.version>
      <eventuberdevice.version>2</eventuberdevice.version>
  </properties>
  <!-- A lot more here, but as it is irrelevant for the problem I have removed it -->
</project>

If I load example.xml and parse it with ElementTree, I can see that its namespace is http://maven.apache.org/POM/4.0.0:

>>> from xml.etree import ElementTree
>>> tree = ElementTree.parse('example.xml')
>>> print(tree.getroot())
<Element '{http://maven.apache.org/POM/4.0.0}project' at 0x26ee0f0>

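I have not found a method I can call to get just the namespace from an Element without parsing the output of str(an_element). It seems like there should be a better way.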


3 answers

I'm not sure whether this is possible with xml.etree, but here is how you can do it with lxml.etree:

>>> from lxml import etree
>>> tree = etree.parse('example.xml')
>>> tree.xpath('namespace-uri(.)')
'http://maven.apache.org/POM/4.0.0'

The namespace should be in Element.tag, wrapped in curly braces right before the "actual" tag name:

>>> root = tree.getroot()
>>> root.tag
'{http://maven.apache.org/POM/4.0.0}project'

To learn more about namespaces, see "ElementTree: Working with Namespaces and Qualified Names".
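Since the tag has the form {uri}localname, the two parts can be separated with plain string methods. The following is only a minimal sketch using the standard library: split_tag is a hypothetical helper name, not part of ElementTree, and the XML is a shortened inline copy of example.xml so the snippet runs on its own.

```python
from xml.etree import ElementTree


def split_tag(tag):
    """Return (namespace_uri, local_name) for an ElementTree tag string.

    Hypothetical helper: ElementTree itself only exposes the combined
    '{uri}localname' string via Element.tag.
    """
    if tag.startswith('{'):
        # Drop the leading '{' and split at the first closing '}'.
        uri, _, local = tag[1:].partition('}')
        return uri, local
    # No namespace on this tag.
    return '', tag


# Shortened inline stand-in for example.xml.
root = ElementTree.fromstring(
    '<project xmlns="http://maven.apache.org/POM/4.0.0">'
    '<version>13-SNAPSHOT</version>'
    '</project>'
)

print(split_tag(root.tag))
# ('http://maven.apache.org/POM/4.0.0', 'project')
```

Note that this returns the bare URI without the braces, unlike the raw tag.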

This is a perfect task for a regular expression.

import re

def namespace(element):
    m = re.match(r'\{.*\}', element.tag)
    return m.group(0) if m else ''
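Because the function above returns the namespace with its surrounding braces, the result can be prepended directly to a child tag name when searching. A small usage sketch, assuming a shortened inline copy of example.xml instead of the file on disk:

```python
import re
from xml.etree import ElementTree


def namespace(element):
    # Same function as above: match the leading '{...}' part of the tag.
    m = re.match(r'\{.*\}', element.tag)
    return m.group(0) if m else ''


# Shortened inline stand-in for example.xml.
root = ElementTree.fromstring(
    '<project xmlns="http://maven.apache.org/POM/4.0.0">'
    '<version>13-SNAPSHOT</version>'
    '</project>'
)

ns = namespace(root)
print(ns)
# {http://maven.apache.org/POM/4.0.0}

# The braces are kept, so ns can prefix a child's local name as-is.
version = root.find(ns + 'version')
print(version.text)
# 13-SNAPSHOT
```

If only the bare URI is needed, strip the braces with ns[1:-1].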
