How can I count the tags in an XML file that contain "specific text", regardless of hierarchy?

2 answers

User

#1 · edited 2024-10-01 13:27:32

If you want to do this with the standard library only (i.e. without the lxml dependency), you can try the following (assuming your XML file is sample.xml):

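from xml.etree import ElementTree as ET

xml = ET.parse('sample.xml')
count = 0
# ".//sub[context]" selects every <sub>, at any depth, that has a <context> child
for e in xml.findall(".//sub[context]"):
    # count the <sub> only if its first <context> child holds one of the two values
    if e.find("context").text in ('aligned', 'not-aligned'):
        count += 1
print(count)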
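
To try the same loop without a sample.xml on disk, here is a minimal self-contained sketch. The inline document mirrors the sample XML from the other answer, and the helper name `count_sub_contexts` is made up for illustration, not part of any library.

```python
from xml.etree import ElementTree as ET

# Inline stand-in for sample.xml (assumed content, mirroring the other answer).
SAMPLE = """<xml>
  <t1>fdhdhd</t1>
  <sub><context>aligned</context></sub>
  <context>not-aligned</context>
  <sub><context>aligned</context></sub>
</xml>"""


def count_sub_contexts(root, values=("aligned", "not-aligned")):
    """Count <sub> elements whose first <context> child has text in `values`."""
    count = 0
    # ".//sub[context]" matches <sub> elements at any depth that have a <context> child.
    for sub in root.findall(".//sub[context]"):
        if sub.find("context").text in values:
            count += 1
    return count


root = ET.fromstring(SAMPLE)
print(count_sub_contexts(root))  # 2
```

Note that the top-level `<context>not-aligned</context>` is not counted here, because it does not sit under a `<sub>`.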

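Edit: If I understand your comment on my answer correctly, you never want to count 'not-aligned' and 'aligned' together, only one of the two values at a time. You also don't actually care which element context appears under. In that case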

^{pr2}$

should give you what you want.
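
One way the edit above could look in practice, as a sketch rather than the exact code from this answer: walk every `<context>` element at any depth and compare its text with a single target value. The function name `count_context` and the inline document are assumptions for illustration.

```python
from xml.etree import ElementTree as ET


def count_context(root, value):
    """Count <context> elements at any depth whose text equals `value`."""
    # Element.iter("context") visits matching elements regardless of their parent.
    return sum(1 for e in root.iter("context") if e.text == value)


# Assumed sample document: two <context> under <sub>, one directly under the root.
doc = ET.fromstring(
    "<xml>"
    "<sub><context>aligned</context></sub>"
    "<context>not-aligned</context>"
    "<sub><context>aligned</context></sub>"
    "</xml>"
)

print(count_context(doc, "aligned"))      # 2
print(count_context(doc, "not-aligned"))  # 1
```

With a file on disk you would pass `ET.parse('sample.xml').getroot()` instead of `doc`.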

User

#2 · edited 2024-10-01 13:27:32

xml = '''<xml>
  <t1>fdhdhd</t1>
  <t2>fdhdhd</t2>
  <sub>
      <context>aligned</context>
  </sub>
 <context>not-aligned</context>
    <sub>
      <context>aligned</context>
  </sub>
</xml>'''

from lxml import etree

tree = etree.fromstring(xml)
tree.xpath('count(//sub/context[.="aligned" or .="not-aligned"])')

Output:

2.0
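
For comparison, a rough standard-library analogue of the XPath above, as a sketch: ElementTree's limited path syntax accepts a `[.='text']` predicate (Python 3.7+), but it has no `or` operator and no `count()` function, so each value is queried separately and the matches are counted with `len`. The helper name `count_path` is hypothetical.

```python
from xml.etree import ElementTree as ET

XML = """<xml>
  <t1>fdhdhd</t1>
  <t2>fdhdhd</t2>
  <sub>
    <context>aligned</context>
  </sub>
  <context>not-aligned</context>
  <sub>
    <context>aligned</context>
  </sub>
</xml>"""

root = ET.fromstring(XML)


def count_path(root, path):
    """Return how many elements match an ElementTree path expression."""
    return len(root.findall(path))


# Only <context> elements that are direct children of a <sub>.
under_sub = count_path(root, ".//sub/context[.='aligned']")

# <context> elements anywhere in the tree, one query per value (no `or` available).
anywhere = sum(
    count_path(root, f".//context[.='{value}']")
    for value in ("aligned", "not-aligned")
)

print(under_sub, anywhere)  # 2 3
```

The f-string assumes the searched values contain no single quotes; values with quotes would need a plain Python comparison on `e.text` instead.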
