Adding the same element to several child elements in lxml

Published 2024-09-30 14:25:41


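Background: I am trying to enrich an XML report with metadata retrieved from a web service endpoint. The report lists text modules and graphics, and each graphic is available in several resolutions. I am unable to add the metadata to every resolution.

Problem: Here is a simplified version of the problem.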

from lxml import etree as ET

myxml = """\
<report>
    <object id="foo">
        <reportitems>
            <reportitem id="1"/>
            <reportitem id="2"/>
            <reportitem id="3"/>
        </reportitems>
    </object>
</report>
"""

report = ET.fromstring(myxml)
test = ET.Element("test", foo="bar")

for r in report.findall("object/reportitems/reportitem"):
    r.append(test)

I get this output:

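<report>
    <object id="foo">
        <reportitems>
            <reportitem id="1"/>
            <reportitem id="2"/>
            <reportitem id="3"><test foo="bar"/></reportitem>
        </reportitems>
    </object>
</report>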

Now, if I modify the code like this (using the same XML snippet):

report = ET.fromstring(myxml)
myElements = [ET.Element("test1"), ET.Element("test2"), ET.Element("test3")]

counter = 0
for r in report.findall("object/reportitems/reportitem"):
    r.append(myElements[counter])
    counter += 1

...then I get this output:

<report>
    <object id="foo">
        <reportitems>
            <reportitem id="1"><test1/></reportitem>
            <reportitem id="2"><test2/></reportitem>
            <reportitem id="3"><test3/></reportitem>
        </reportitems>
    </object>
</report>

Why can't I add the same (identical) element as a child to each of the elements I iterate over?


2 answers

This behavior is described in the lxml tutorial:

There is another important case where the behaviour of Elements in lxml (in 2.0 and later) deviates from that of lists and from that of the original ElementTree (prior to version 1.3 or Python 2.7/3.2):

>>> for child in root:
...     print(child.tag)
child0
child1
child2
child3
>>> root[0] = root[-1]  # this moves the element in lxml.etree!
>>> for child in root:
...     print(child.tag)
child3
child1
child2

In this example, the last element is moved to a different position, instead of being copied, i.e. it is automatically removed from its previous position when it is put in a different place. In lists, objects can appear in multiple positions at the same time, and the above assignment would just copy the item reference into the first position, so that both contain the exact same item:

>>> l = [0, 1, 2, 3]
>>> l[0] = l[-1]
>>> l
[3, 1, 2, 3]

Note that in the original ElementTree, a single Element object can sit in any number of places in any number of trees, which allows for the same copy operation as with lists. The obvious drawback is that modifications to such an Element will apply to all places where it appears in a tree, which may or may not be intended. The upside of this difference is that an Element in lxml.etree always has exactly one parent, which can be queried through the getparent() method. This is not supported in the original ElementTree.

>>> root is root[0].getparent()  # lxml.etree only!
True

If you want to copy an element to a different position in lxml.etree, consider creating an independent deep copy using the copy module from Python's standard library:

>>> from copy import deepcopy

>>> element = etree.Element("neu")
>>> element.append( deepcopy(root[1]) )

>>> print(element[0].tag)
child1
>>> print([ c.tag for c in root ])
['child3', 'child1', 'child2']
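
Applied to the report from the question, the deep-copy approach might look like the sketch below. It assumes the same XML layout as in the question; the names `template` and `items` are chosen here for illustration and do not appear in the original post.

```python
from copy import deepcopy
from lxml import etree as ET

myxml = """\
<report>
    <object id="foo">
        <reportitems>
            <reportitem id="1"/>
            <reportitem id="2"/>
            <reportitem id="3"/>
        </reportitems>
    </object>
</report>
"""

report = ET.fromstring(myxml)
# Build the element once and treat it as a template (illustrative name).
template = ET.Element("test", foo="bar")

items = report.findall("object/reportitems/reportitem")
for r in items:
    # Append an independent copy so the template itself is never moved.
    r.append(deepcopy(template))

print(ET.tostring(report, encoding="unicode"))
```

Because only copies are inserted, `template` keeps no parent and can be reused for further reports.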

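The issue lies with the ET.Element constructor: each call creates exactly one node. You can change that node's parent, but there is still only one node. To avoid the problem, create a new ET.Element on every iteration of the loop: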

for r in report.findall("object/reportitems/reportitem"):
    node = ET.Element("test", foo="bar")
    r.append(node)
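
To see the single-parent rule at work, one can check getparent() after each append. The sketch below reuses the XML from the question with the whitespace stripped for brevity; `shared`, `parents_seen` and `child_counts` are names chosen here for illustration.

```python
from lxml import etree as ET

report = ET.fromstring(
    '<report><object id="foo"><reportitems>'
    '<reportitem id="1"/><reportitem id="2"/><reportitem id="3"/>'
    '</reportitems></object></report>'
)

shared = ET.Element("test", foo="bar")  # one node, created once
parents_seen = []

for r in report.findall("object/reportitems/reportitem"):
    r.append(shared)  # moves the node; it does not copy it
    # Record which reportitem currently owns the node.
    parents_seen.append(shared.getparent().get("id"))

# Number of children left under each reportitem after the loop.
child_counts = [len(r) for r in report.iter("reportitem")]
print(parents_seen)
print(child_counts)
```

Each append re-parents the single node, so once the loop finishes only the last reportitem still contains it.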
