"No element found" error at the end of an XML file when using ElementTree

Posted 2024-05-06 16:27:33


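Python newbie here. I'm just practicing extracting certain elements from an XML file. I'm working through this Datacamp tutorial and trying to parse the "movies" XML file provided at the start of the tutorial.

It looks like this: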

<?xml version="1.0"?>
<collection>
    <genre category="Action">
        <decade years="1980s">
            <movie favorite="True" title="Indiana Jones: The raiders of the lost Ark">
                <format multiple="No">DVD</format>
                <year>1981</year>
                <rating>PG</rating>
                <description>
                'Archaeologist and adventurer Indiana Jones 
                is hired by the U.S. government to find the Ark of the 
                Covenant before the Nazis.'
                </description>
            </movie>
               <movie favorite="True" title="THE KARATE KID">
               <format multiple="Yes">DVD,Online</format>
               <year>1984</year>
               <rating>PG</rating>
               <description>None provided.</description>
            </movie>
            <movie favorite="False" title="Back 2 the Future">
               <format multiple="False">Blu-ray</format>
               <year>1985</year>
               <rating>PG</rating>
               <description>Marty McFly</description>
            </movie>
        </decade>
        <decade years="1990s">
            <movie favorite="False" title="X-Men">
               <format multiple="Yes">dvd, digital</format>
               <year>2000</year>
               <rating>PG-13</rating>
               <description>Two mutants come to a private academy for their kind whose resident superhero team must 
               oppose a terrorist organization with similar powers.</description>
            </movie>
            <movie favorite="True" title="Batman Returns">
               <format multiple="No">VHS</format>
               <year>1992</year>
               <rating>PG13</rating>
               <description>NA.</description>
            </movie>
               <movie favorite="False" title="Reservoir Dogs">
               <format multiple="No">Online</format>
               <year>1992</year>
               <rating>R</rating>
               <description>WhAtEvER I Want!!!?!</description>
            </movie>
        </decade>    
    </genre>

    <genre category="Thriller">
        <decade years="1970s">
            <movie favorite="False" title="ALIEN">
                <format multiple="Yes">DVD</format>
                <year>1979</year>
                <rating>R</rating>
                <description>"""""""""</description>
            </movie>
        </decade>
        <decade years="1980s">
            <movie favorite="True" title="Ferris Bueller's Day Off">
                <format multiple="No">DVD</format>
                <year>1986</year>
                <rating>PG13</rating>
                <description>Funny movie about a funny guy</description>
            </movie>
            <movie favorite="FALSE" title="American Psycho">
                <format multiple="No">blue-ray</format>
                <year>2000</year>
                <rating>Unrated</rating>
                <description>psychopathic Bateman</description>
            </movie>
        </decade>
    </genre>

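I understand that I have to add a closing </collection> tag for the file to parse. However, I am still getting this error: xml.etree.ElementTree.ParseError: no element found: line 74, column 12

when running this code: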

import os
import pandas as pd
import xml.etree.ElementTree as ET
tree = ET.parse('movies.xml')

Is the last "genre" element closed incorrectly? I'm not sure why I can't parse the file. Any help is appreciated, thanks.
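For context, "no element found" is the message the parser gives when the input ends while an element is still open. The reported position (line 74, column 12) is the end of the final </genre> line in the listing as posted, that is, the end of a file with no closing </collection>. Below is a minimal sketch of that behaviour. The two inline strings are stand-ins made up for illustration, not the actual movies.xml.

```python
import xml.etree.ElementTree as ET

# Hypothetical miniature documents standing in for movies.xml.
complete = '<collection><genre category="Thriller"></genre></collection>'
truncated = '<collection><genre category="Thriller"></genre>'  # root never closed

# The complete document parses and exposes its root element.
root = ET.fromstring(complete)
print(root.tag)

# The truncated document runs out of input while <collection> is still open.
try:
    ET.fromstring(truncated)
    message = None
except ET.ParseError as err:
    message = str(err)
    position = err.position  # (line, column) where parsing stopped
print(message)
```

Here the last <genre> element is closed correctly in both strings. Only the missing root close triggers the error.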


1 answer
Answer #1 · posted 2024-05-06 16:27:33

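The cause was an unnecessary os.chdir (change directory) statement placed after the imports and before parsing the XML file.

Code that raised the error: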

import os
import pandas as pd
import xml.etree.ElementTree as ET
os.chdir('filepath')
tree = ET.parse('movies.xml')
root = tree.getroot()

Once the os.chdir statement was removed, the file parsed correctly.
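The answer does not say why the directory change mattered. One plausible explanation, offered here only as an assumption, is that after os.chdir the relative name 'movies.xml' resolved to a different copy of the file, one still missing </collection>. A minimal sketch of a way to guard against that is to resolve the path explicitly and include it in any error. The helper parse_xml and the "fixed" and "stale" directories are hypothetical names used only for this illustration.

```python
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path


def parse_xml(path):
    """Parse an XML file, naming the resolved path if parsing fails."""
    resolved = Path(path).resolve()
    try:
        return ET.parse(str(resolved))
    except ET.ParseError as err:
        # Include the absolute path so a stale or truncated copy is easy to spot.
        raise ValueError(f"{resolved}: {err}") from err


# Demonstration with two hypothetical copies that share a file name.
with tempfile.TemporaryDirectory() as tmp:
    good = Path(tmp, "fixed", "movies.xml")
    stale = Path(tmp, "stale", "movies.xml")
    for p in (good, stale):
        p.parent.mkdir()
    good.write_text("<collection><genre category='Action'/></collection>")
    stale.write_text("<collection><genre category='Action'/>")  # no closing tag

    root_tag = parse_xml(good).getroot().tag
    try:
        parse_xml(stale)
        stale_error = None
    except ValueError as err:
        stale_error = str(err)

print(root_tag)
print(stale_error)
```

With the full path in the message, it is immediately visible which movies.xml the parser actually opened.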
