Converting XML to a DataFrame for Pandas (Python)

Posted 2024-10-01 17:31:16


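I want to convert an XML file into a DataFrame so that I can analyze the data with Pandas in a Python notebook. However, every solution I have found on this site gives me an error.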

Here is a shortened version of the XML:

    <EML xmlns="urn:oasis:names:tc:evs:schema:eml" xmlns:ns2="urn:oasis:names:tc:ciq:xsdschema:xAL:2.0" xmlns:ns3="urn:oasis:names:tc:ciq:xsdschema:xNL:2.0" xmlns:ns4="http://www.w3.org/2000/09/xmldsig#" xmlns:ns5="urn:oasis:names:tc:evs:schema:eml:ts" xmlns:ns6="http://www.kiesraad.nl/extensions" xmlns:ns7="http://www.kiesraad.nl/reportgenerator" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" Id="230b" SchemaVersion="5" xsi:schemaLocation="urn:oasis:names:tc:evs:schema:eml 230-candidatelist-v5-0.xsd http://www.kiesraad.nl/extensions kiesraad-eml-extensions.xsd">
      <!--
      Created by: Ondersteunende Software Verkiezingen by IVU Traffic Technologies AG, program: P2-3, version: 2.19.2
      -->
      <TransactionId>1</TransactionId>
      <ManagingAuthority>
        <AuthorityIdentifier Id="CSB">De Kiesraad</AuthorityIdentifier>
        <AuthorityAddress/>
      </ManagingAuthority>
      <IssueDate>2017-02-13</IssueDate>
      <ns6:CreationDateTime>2017-02-13T16:35:14.403+01:00</ns6:CreationDateTime>
      <ns4:CanonicalizationMethod Algorithm="http://www.w3.org/TR/2001/REC-xml-c14n-20010315#WithComments"/>
      <CandidateList>
        <Election>
          <ElectionIdentifier Id="TK2017">
            <ElectionName>Tweede Kamer der Staten-Generaal 2017</ElectionName>
            <ElectionCategory>TK</ElectionCategory>
            <ns6:ElectionSubcategory>TK</ns6:ElectionSubcategory>
            <ns6:ElectionDate>2017-03-15</ns6:ElectionDate>
            <ns6:NominationDate>2017-01-30</ns6:NominationDate>
          </ElectionIdentifier>
          <Contest>
            <ContestIdentifier Id="9">
              <ContestName>Amsterdam</ContestName>
            </ContestIdentifier>
            <Affiliation>
              <AffiliationIdentifier Id="1">
                <RegisteredName>VVD</RegisteredName>
              </AffiliationIdentifier>
              <Type>stel gelijkluidende lijsten</Type>
              <ns6:ListData BelongsToSet="1" PublicationLanguage="nl" PublishGender="true"/>
              <Candidate>
                <CandidateIdentifier Id="1"/>
                <CandidateFullName>
                  <ns3:PersonName>
                    <ns3:NameLine NameType="Initials">M.</ns3:NameLine>
                    <ns3:FirstName>Mark</ns3:FirstName>
                    <ns3:LastName>Rutte</ns3:LastName>
                  </ns3:PersonName>
                </CandidateFullName>
                <Gender>male</Gender>
                <QualifyingAddress>
                  <ns2:Locality>
                    <ns2:LocalityName>'s-Gravenhage</ns2:LocalityName>
                  </ns2:Locality>
                </QualifyingAddress>
              </Candidate>
              <Candidate>
                <CandidateIdentifier Id="2"/>
                <CandidateFullName>
                  <ns3:PersonName>
                    <ns3:NameLine NameType="Initials">J.A.</ns3:NameLine>
                    <ns3:FirstName>Jeanine</ns3:FirstName>
                    <ns3:LastName>Hennis-Plasschaert</ns3:LastName>
                  </ns3:PersonName>
                </CandidateFullName>
                <Gender>female</Gender>
                <QualifyingAddress>
                  <ns2:Locality>
                    <ns2:LocalityName>Nederhorst den Berg</ns2:LocalityName>
                  </ns2:Locality>
                </QualifyingAddress>
              </Candidate>
            </Affiliation>
          </Contest>
        </Election>
      </CandidateList>
    </EML>

I would like to convert this with Python commands, run it in my kernel, and then do my Pandas analysis on the result.

Data source: https://data.openstate.eu/dataset/kandidatenlijsten This is the code I am currently using (from http://www.austintaylor.io/lxml/python/pandas/xml/dataframe/2016/07/08/convert-xml-to-pandas-dataframe/):

^{pr2}$

This produces the error: AssertionError: 3 columns passed, passed data had 2 columns
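For context, pandas reports this message when the `columns` list names more fields than each data row contains (here three names against two-field rows). The code that triggered it is not shown above, so the following is only a minimal sketch with hypothetical column names and rows, illustrating how such a mismatch could be detected before the frame is built:

```python
def find_width_mismatches(rows, columns):
    """Return (index, row) pairs whose field count differs from len(columns)."""
    return [(i, row) for i, row in enumerate(rows) if len(row) != len(columns)]


# Hypothetical data: three column names, but every row carries only two fields.
columns = ["candidate_id", "first_name", "last_name"]
rows = [("1", "Rutte"), ("2", "Hennis-Plasschaert")]

mismatches = find_width_mismatches(rows, columns)
for index, row in mismatches:
    print(f"row {index} has {len(row)} fields, expected {len(columns)}: {row}")
```

If any row is listed, either the extraction step dropped a field or the column list is longer than intended.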

Thank you, and kind regards.
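One possible approach, sketched here under assumptions rather than offered as a definitive fix, is to walk the `Candidate` elements explicitly with the standard library's `xml.etree.ElementTree` and build one dictionary per candidate. The namespace URIs are copied from the root element of the sample; the prefixes `eml`, `xal`, `xnl`, the helper names `candidate_rows` and `to_dataframe`, and the output column names are illustrative choices, and the embedded XML is a trimmed copy of the sample above.

```python
import xml.etree.ElementTree as ET

# Namespace URIs taken from the sample's root element; the prefixes are local choices.
NS = {
    "eml": "urn:oasis:names:tc:evs:schema:eml",
    "xal": "urn:oasis:names:tc:ciq:xsdschema:xAL:2.0",
    "xnl": "urn:oasis:names:tc:ciq:xsdschema:xNL:2.0",
}

# Trimmed copy of the sample document, kept inline so the sketch is self-contained.
SAMPLE_XML = """<EML xmlns="urn:oasis:names:tc:evs:schema:eml"
     xmlns:ns2="urn:oasis:names:tc:ciq:xsdschema:xAL:2.0"
     xmlns:ns3="urn:oasis:names:tc:ciq:xsdschema:xNL:2.0">
  <CandidateList><Election><Contest>
    <Affiliation>
      <AffiliationIdentifier Id="1"><RegisteredName>VVD</RegisteredName></AffiliationIdentifier>
      <Candidate>
        <CandidateIdentifier Id="1"/>
        <CandidateFullName><ns3:PersonName>
          <ns3:NameLine NameType="Initials">M.</ns3:NameLine>
          <ns3:FirstName>Mark</ns3:FirstName>
          <ns3:LastName>Rutte</ns3:LastName>
        </ns3:PersonName></CandidateFullName>
        <Gender>male</Gender>
        <QualifyingAddress><ns2:Locality>
          <ns2:LocalityName>'s-Gravenhage</ns2:LocalityName>
        </ns2:Locality></QualifyingAddress>
      </Candidate>
      <Candidate>
        <CandidateIdentifier Id="2"/>
        <CandidateFullName><ns3:PersonName>
          <ns3:NameLine NameType="Initials">J.A.</ns3:NameLine>
          <ns3:FirstName>Jeanine</ns3:FirstName>
          <ns3:LastName>Hennis-Plasschaert</ns3:LastName>
        </ns3:PersonName></CandidateFullName>
        <Gender>female</Gender>
        <QualifyingAddress><ns2:Locality>
          <ns2:LocalityName>Nederhorst den Berg</ns2:LocalityName>
        </ns2:Locality></QualifyingAddress>
      </Candidate>
    </Affiliation>
  </Contest></Election></CandidateList>
</EML>"""


def candidate_rows(root):
    """Return one dict per Candidate element, tagged with its party name."""
    rows = []
    for affiliation in root.iterfind(".//eml:Affiliation", NS):
        party = affiliation.findtext(
            "eml:AffiliationIdentifier/eml:RegisteredName", namespaces=NS
        )
        for candidate in affiliation.iterfind("eml:Candidate", NS):
            identifier = candidate.find("eml:CandidateIdentifier", NS)
            rows.append({
                "party": party,
                "candidate_id": identifier.get("Id") if identifier is not None else None,
                "initials": candidate.findtext(".//xnl:NameLine", namespaces=NS),
                "first_name": candidate.findtext(".//xnl:FirstName", namespaces=NS),
                "last_name": candidate.findtext(".//xnl:LastName", namespaces=NS),
                "gender": candidate.findtext("eml:Gender", namespaces=NS),
                "locality": candidate.findtext(".//xal:LocalityName", namespaces=NS),
            })
    return rows


def to_dataframe(rows):
    """Build a DataFrame from the row dicts (requires pandas to be installed)."""
    import pandas as pd  # imported lazily so the parsing step works without pandas
    return pd.DataFrame(rows)


rows = candidate_rows(ET.fromstring(SAMPLE_XML))
print(rows[0])
```

Calling `to_dataframe(rows)` should then give one row per candidate, with missing elements showing up as empty values instead of shifting columns. For a file on disk, `ET.parse(path).getroot()` yields the same kind of root element to pass to `candidate_rows`.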


Tags: id, http, names, www, nl, candidate, tc, eml
