XML parsing problem in Python using xml.etree.ElementT

<?xml version="1.0" encoding="UTF-8"?>
<Response rid="1000" status="succeeded" moreData="false">
  <Results completed="true" total="25" matched="5" processed="25">
    <Resource type="h" DisplayName="Host" name="tango">
      <Time start="2011/12/16/18/46/00" end="2011/12/16/19/46/00"/>
      <PerfData attrId="cpuUsage" attrName="Usage">
        <Data intr="5" start="2011/12/16/19" end="2011/12/16/19" data="36.00"/>
        <Data intr="5" start="2011/12/16/19" end="2011/12/16/19" data="86.00"/>
        <Data intr="5" start="2011/12/16/19" end="2011/12/16/19" data="29.00"/>
      </PerfData>
      <Resource type="vm" DisplayName="VM" name="charlie" baseHost="tango">
        <Time start="2011/12/16/18/46/00" end="2011/12/16/19/46/00"/>
        <PerfData attrId="cpuUsage" attrName="Usage">
          <Data intr="5" start="2011/12/16/19" end="2011/12/16/19" data="6.00"/>
        </PerfData>
      </Resource>
    </Resource>
  </Results>
</Response>

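import re
# assumed import: the question only shows the call xml.fromstring(...)
import xml.etree.ElementTree as xml

# data is the raw text that contains the XML response shown above
pattern = re.compile(r'(<Response.*?</Response>)', re.VERBOSE | re.MULTILINE)
for match in pattern.finditer(data):
    contents = match.group(1)
    responses = xml.fromstring(contents)
    for results in responses:
        result = results.tag
        for resources in results:
            resource = resources.tag
            temp = {}
            temp = resources.attrib
            print(temp)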

1 answer

User

#1 · Posted 2024-09-22 16:38:59

Don't parse XML with regular expressions! It does not work. Use an XML parsing library instead, for example lxml:

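Edit: the code sample now fetches only the top-level resources, loops over them, and tries to get their "sub-resources". This follows the OP's request in the comments.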

from lxml import etree

content = '''
YOUR XML HERE
'''

root = etree.fromstring(content)

# search for all "top level" resources
resources = root.xpath("//Resource[not(ancestor::Resource)]")
for resource in resources:
    # copy the resource attributes into a dict
    mashup = dict(resource.attrib)
    # find child resource elements
    subresources = resource.xpath("./Resource")
    # if we find only one resource, add it to the mashup
    if len(subresources) == 1:
        mashup['resource'] = dict(subresources[0].attrib)
    # otherwise it is unclear what the OP wants

    print(mashup)

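This prints one dict per top-level Resource. For the XML in the question, that dict holds the host's attributes (type, DisplayName, name) plus a 'resource' key containing the attributes of the nested VM resource (type, DisplayName, name, baseHost).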
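If installing lxml is not an option, roughly the same traversal can be sketched with the standard library's xml.etree.ElementTree. This is only a sketch under a few assumptions: the sample is a trimmed copy of the question's XML with the closing tag written as </Results>, the names collect_resources and SAMPLE are made up for illustration, and nested resources are stored as a list under a 'resources' key (rather than the single 'resource' key used above) so that several children can be handled. ElementTree's limited XPath support has no ancestor:: axis, so the top-level resources are addressed by their path below the root.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the question's sample (assumption: closing tag is </Results>).
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<Response rid="1000" status="succeeded" moreData="false">
  <Results completed="true" total="25" matched="5" processed="25">
    <Resource type="h" DisplayName="Host" name="tango">
      <Time start="2011/12/16/18/46/00" end="2011/12/16/19/46/00"/>
      <Resource type="vm" DisplayName="VM" name="charlie" baseHost="tango">
        <Time start="2011/12/16/18/46/00" end="2011/12/16/19/46/00"/>
      </Resource>
    </Resource>
  </Results>
</Response>"""


def collect_resources(xml_bytes):
    """Return one dict per top-level Resource; nested ones go under 'resources'.

    Hypothetical helper, not part of the original answer.
    """
    root = ET.fromstring(xml_bytes)
    collected = []
    # No ancestor:: axis in ElementTree, so select top-level resources
    # by their fixed path below the <Response> root instead.
    for resource in root.findall("./Results/Resource"):
        entry = dict(resource.attrib)
        # Direct children only; deeper nesting is not followed here.
        entry["resources"] = [
            dict(child.attrib) for child in resource.findall("./Resource")
        ]
        collected.append(entry)
    return collected


if __name__ == "__main__":
    print(collect_resources(SAMPLE))
```

Using a list for the nested resources sidesteps the "exactly one sub-resource" check in the lxml version, at the cost of a slightly different result shape.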
