How to get the paths of all elements in lxml, with attributes

Posted 2024-09-28 21:09:22


I have the following code:

from lxml import etree

# new_xml is the already-parsed root element
tree = etree.ElementTree(new_xml)
for e in new_xml.iter():
    print(tree.getpath(e), e.text)

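This gives me each element's positional path followed by its text, with repeated elements identified by index, as in /Item/Purchases/Purchase[1]/URL.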

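However, I don't want paths that identify the repeated elements by position; I want paths that identify them by attribute. Here is what the XML looks like: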

<Item>
  <Purchases>
    <Purchase Country="US">
      <URL>http://tvgo.xfinity.com/watch/x/6091165US</URL>
      <Rating>R</Rating>
    </Purchase>
    <Purchase Country="CA">
      <URL>http://tvgo.xfinity.com/watch/x/6091165CA</URL>
      <Rating>R</Rating>
    </Purchase>
  </Purchases>
</Item>

How can I get the following paths instead?

/Item/Purchases 

/Item/Purchases/Purchase[@Country="US"]
/Item/Purchases/Purchase[@Country="US"]/URL http://tvgo.xfinity.com/watch/x/6091165185315991112/movies
/Item/Purchases/Purchase[@Country="US"]/Rating R

/Item/Purchases/Purchase[@Country="CA"]
/Item/Purchases/Purchase[@Country="CA"]/URL http://tvgo.xfinity.com/watch/x/6091165185315991112/movies
/Item/Purchases/Purchase[@Country="CA"]/Rating R

1 answer
Answered by a user · Posted 2024-09-28 21:09:22

It's not pretty, but it works.

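import re

replacements = {}

# tree is the etree.ElementTree built in the question
for e in tree.iter():
    path = tree.getpath(e)

    # A path ending in /Purchase[n] is a Purchase element: record how its
    # positional predicate maps to an attribute predicate.
    if re.search(r'/Purchase\[\d+\]$', path):
        new_predicate = '[@Country="' + e.attrib['Country'] + '"]'
        new_path = re.sub(r'\[\d+\]$', new_predicate, path)
        replacements[path] = new_path

    # Rewrite every known positional prefix in the current path.
    for key, replacement in replacements.items():
        path = path.replace(key, replacement)

    # e.text is None for elements that have no text at all
    print(path, (e.text or '').strip())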

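For me, this prints the paths in the requested form: each Purchase step carries a [@Country="..."] predicate instead of a positional index, followed by the element's text.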
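The same result can probably be reached without post-processing the getpath output, by building each path while walking the tree. What follows is only a minimal sketch of that alternative, not the approach used above. The function name attribute_paths and its key_attr parameter are made up for illustration, and it parses with the standard library's xml.etree.ElementTree so that it runs without lxml installed. It relies only on .tag, .attrib, .text and child iteration, which lxml elements also provide.

```python
import xml.etree.ElementTree as ET

XML = """<Item>
  <Purchases>
    <Purchase Country="US">
      <URL>http://tvgo.xfinity.com/watch/x/6091165US</URL>
      <Rating>R</Rating>
    </Purchase>
    <Purchase Country="CA">
      <URL>http://tvgo.xfinity.com/watch/x/6091165CA</URL>
      <Rating>R</Rating>
    </Purchase>
  </Purchases>
</Item>"""


def attribute_paths(root, key_attr="Country"):
    """Yield (path, text) for every element, depth first.

    Hypothetical helper: an element carrying `key_attr` gets a predicate
    such as [@Country="US"]; any other element is named by its tag alone.
    """
    def walk(elem, prefix):
        step = "/" + elem.tag
        if key_attr in elem.attrib:
            step += '[@%s="%s"]' % (key_attr, elem.attrib[key_attr])
        path = prefix + step
        # elem.text is None when the element has no text at all
        yield path, (elem.text or "").strip()
        for child in elem:
            yield from walk(child, path)

    return walk(root, "")


paths = list(attribute_paths(ET.fromstring(XML)))
for path, text in paths:
    print(path, text)
```

Unlike getpath, this sketch adds no positional index, so sibling elements with the same tag and no Country attribute (or the same Country value) end up with identical paths. It also does not escape double quotes inside attribute values.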
