Out-of-memory and other problems when parsing a large XML file

Posted 2024-09-27 00:14:12


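I have an XML trace file of about 350 MB. When I run the code below, it sometimes runs out of memory and at other times fails with an error saying the file cannot be parsed. How should I parse a file this large? Should I use a different parsing approach?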

    root = ET.parse('E:/software/jm_16.1/bin/tracefile.xml').getroot()
    lst = root.findall('AVCTrace/Picture/SubPicture/Slice/MacroBlock')
    for item in lst:
        print(item.get('QP_Y'))

I also generated a smaller file with the same structure as the one above, and with it the variable `lst` is empty. Do you know what the problem is? I also need to extract the X and Y tags inside each MacroBlock.

My XML trace file looks like this:

            <MacroBlock num="8158">
                <SubMacroBlock num="0">
                    <Type>1</Type>
                    <TypeString>B_L0_8x8</TypeString>
                    <MotionVector list="0">
                        <RefIdx>0</RefIdx>
                        <Difference>
                            <X>-1</X>
                            <Y>-2</Y>
                        </Difference>
                        <Absolute>
                            <X>-4</X>
                            <Y>-6</Y>
                        </Absolute>
                    </MotionVector>
                </SubMacroBlock>

1 answer
User
#1 · posted 2024-09-27 00:14:12

    import xml.etree.ElementTree as ET

    root = ET.parse('68071609.xml').getroot()
    print(root.tag)  # Picture
    elems = root.findall('SubPicture/Slice/MacroBlock/QP_Y')
    for elem in elems:
        print(elem.text)  # 28

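The root of the tree is already `Picture`, so you should not search for `Picture/...` inside it.
By appending the node named `QP_Y` to the search path, you can find all of those nodes directly.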
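
A minimal sketch of why the path matters, using a hypothetical in-memory document (`SAMPLE` is made up for illustration and only imitates the layout described here; the real trace file may differ). `findall` resolves its path relative to the element it is called on, so a path that starts with the root's own tag matches nothing:

```python
import xml.etree.ElementTree as ET

# Hypothetical miniature trace; the real file's layout may differ.
SAMPLE = """<Picture>
  <SubPicture>
    <Slice>
      <MacroBlock num="0"><QP_Y>28</QP_Y></MacroBlock>
      <MacroBlock num="1"><QP_Y>30</QP_Y></MacroBlock>
    </Slice>
  </SubPicture>
</Picture>"""

root = ET.fromstring(SAMPLE)

# The path is resolved relative to the root element, so repeating
# the root's own tag at the start of the path matches nothing.
with_root = root.findall('Picture/SubPicture/Slice/MacroBlock')

# Starting from the root's children finds both macroblocks.
without_root = root.findall('SubPicture/Slice/MacroBlock')

# './/MacroBlock' searches all descendants regardless of depth.
any_depth = root.findall('.//MacroBlock')

print(len(with_root), len(without_root), len(any_depth))
```

If the exact nesting of the file is uncertain, printing `root.tag` first and then using the `.//MacroBlock` form is a quick way to check whether an empty result comes from the path or from the data.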

If you prefer to iterate over the macroblocks and read the QP_Y of each one:

    elems = root.findall('SubPicture/Slice/MacroBlock')
    for elem in elems:
        print(elem.attrib)  # {'num': '0'}
        qp_y = next(child for child in elem if child.tag == "QP_Y").text  # will throw StopIteration if missing
        print(qp_y)  # 28
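
For the memory problem with the 350 MB file, one possible approach is to stream the document with `ET.iterparse` instead of loading it whole with `ET.parse`. The sketch below is only an illustration under assumptions: `SAMPLE` and `stream_macroblocks` are hypothetical names, the sample imitates the excerpt from the question, and it assumes `QP_Y` is a child element of `MacroBlock` and that the X and Y values of interest are the ones under `Absolute`.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical miniature trace modelled on the excerpt in the question.
SAMPLE = b"""<Picture><SubPicture><Slice>
<MacroBlock num="0">
  <QP_Y>28</QP_Y>
  <SubMacroBlock num="0">
    <MotionVector list="0">
      <Difference><X>-1</X><Y>-2</Y></Difference>
      <Absolute><X>-4</X><Y>-6</Y></Absolute>
    </MotionVector>
  </SubMacroBlock>
</MacroBlock>
<MacroBlock num="1"><QP_Y>30</QP_Y></MacroBlock>
</Slice></SubPicture></Picture>"""


def stream_macroblocks(source):
    """Yield (num, QP_Y text, [(x, y), ...]) for each MacroBlock as it is parsed."""
    # 'end' events fire once an element and all of its children are complete.
    for _event, elem in ET.iterparse(source, events=('end',)):
        if elem.tag != 'MacroBlock':
            continue
        # Assumption: the Absolute X/Y pair is wanted; values are kept as text.
        vectors = [
            (mv.findtext('Absolute/X'), mv.findtext('Absolute/Y'))
            for mv in elem.iterfind('SubMacroBlock/MotionVector')
        ]
        yield elem.get('num'), elem.findtext('QP_Y'), vectors
        # Release the children that were just processed; the emptied
        # MacroBlock element itself stays attached to its parent.
        elem.clear()


# A file path such as the one in the question could be passed instead of BytesIO.
rows = list(stream_macroblocks(io.BytesIO(SAMPLE)))
for row in rows:
    print(row)
```

Because each `MacroBlock` is cleared after it is read, memory use should grow far more slowly than with a full `ET.parse`, although the emptied elements still accumulate. This does not help with a file that is not well-formed XML: in that case `iterparse` will still raise a parse error when it reaches the bad spot.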
