Problem parsing certain elements with xml.etree.ElementTree

    <?xml version="1.0"?>
    <bugrepository name="AspectJ">
      <bug id="28974" opendate="2003-1-3 10:28:00" fixdate="2003-1-14 14:30:00">
        <buginformation>
          <summary>"Compiler error when introducing a ""final"" field"</summary>
          <description>The aspecs the problem...</description>
        </buginformation>
        <fixedFiles>
          <file>org.aspectj/modules/weaver/src/org/aspectj/weaver/AjcMemberMaker.java</file>
        </fixedFiles>
      </bug>
      <bug id="28919" opendate="2002-12-30 16:40:00" fixdate="2003-1-14 15:06:00">
        <buginformation>
          <summary>waever tries to weave into native methods ...</summary>
          <description>If youat org.aspectj.ajdt.internal.core.burce</description>
        </buginformation>
        <fixedFiles>
          <file>org.aspectj/modules/weaver/src/org/aspectj/weaver/bcel/LazyMethodGen.java</file>
        </fixedFiles>
      </bug>
      <bug id="29186" opendate="2003-1-8 21:22:00" fixdate="2003-1-14 16:43:00">
        <buginformation>
          <summary>ajc -emacssym chokes on pointcut that includes an intertype method</summary>
          <description>This ;void Foo.ajc$before$Foo</description>
        </buginformation>
        <fixedFiles>
          <file>org.aspectj/modules/weaver/src/org/aspectj/weaver/Lint.java</file>
          <file>org.aspectj/modules/weaver/src/org/aspectj/weaver/Shadow.java</file>
          <file>org.aspectj/modules/weaver/src/org/aspectj/weaver/bcel/BcelWeaver.java</file>
        </fixedFiles>
      </bug>
      <bug id="29769" opendate="2003-1-19 11:42:00" fixdate="2003-1-24 21:17:00">
        <buginformation>
          <summary>Ajde does not support new AspectJ 1.1 compiler options</summary>
          <description>The org.aspectj.ajpiler.
    This enhancement is needed byort.</description>
        </buginformation>
        <fixedFiles>
          <file>org.aspectj/modules/ajde/testdata/examples/figures-coverage/figures/Figure.java</file>
          <file>org.aspectj/modules/ajde/testsrc/org/aspectj/ajde/AjdeTests.java</file>
          <file>org.aspectj/modules/ajde/testsrc/org/aspectj/ajde/ui/StructureViewManagerTest.java</file>
          <file>org.aspectj/modules/org.aspectj.ajdt.core/src/org/aspectj/ajdt/ajc/BuildArgParser.java</file>
          <file>org.aspectj/modules/org.aspectj.ajdt.core/src/org/aspectj/ajdt/internal/core/builder/AjBuildConfig.java</file>
          <file>org.aspectj/modules/org.aspectj.ajdt.core/testsrc/org/aspectj/ajdt/ajc/BuildArgParserTestCase.java</file>
        </fixedFiles>
      </bug>
      <bug id="29959" opendate="2003-1-22 7:10:00" fixdate="2003-2-13 16:00:00">
        <buginformation>
          <summary>super call in intertype method declaration body causes VerifyError</summary>
          <description>AspectJ Compiler 1.1 showstopper</description>
        </buginformation>
        <fixedFiles>
          <file>org.aspectj/modules/org.aspectj.ajdt.core/src/org/compiler/ast/InterTypeConstructorDeclaration.java</file>
          <file>org.aspectj/modules/org.aspectj.ajdt.core/src/org/aspectj/ajdt/internal/compiler/ast/SuperFixerVisitor.java</file>
          <file>org.aspectj/modules/org.aspectj.ajdt.core/src/org/aspectj/ajdt/internal/compiler/lookup/InterTypeMethodBinding.java</file>
          <file>org.aspectj/modules/tests/bugs/SuperToIntro.java</file>
        </fixedFiles>
      </bug>
    </bugrepository>

    import pandas as pd
    from xml.etree.ElementTree import parse

    document = parse('dataset.xml')
    summary = []
    description = []
    fixedfile = []
    for item in document.iterfind('bug'):
        summary.append(item.findtext('buginformation/summary'))
        description.append(item.findtext('buginformation/description'))
        # findall returns a list of Element objects, not their text
        fixedfile.append(item.findall('fixedFiles/file'))

    #df = pd.DataFrame({'summary':summary, 'description':description, 'fixed_files':fixedfile})
    df = pd.DataFrame({'fixed_files': fixedfile})
    df

    import pandas as pd
    from xml.etree.ElementTree import parse

    document = parse('dataset.xml')
    summary = []
    description = []
    fixedfile = []
    for item in document.iterfind('bug'):
        summary.append(item.findtext('buginformation/summary'))
        description.append(item.findtext('buginformation/description'))
        # findtext returns the text of the first matching <file> only
        fixedfile.append(item.findtext('fixedFiles/file'))

    #df = pd.DataFrame({'summary':summary, 'description':description, 'fixed_files':fixedfile})
    df = pd.DataFrame({'fixed_files': fixedfile})
    df

    import xml.etree.ElementTree as ET
    import pandas as pd

    xmldoc = ET.parse('dataset.xml')
    root = xmldoc.getroot()
    summary = []
    description = []
    fixedfile = []
    for bug in xmldoc.iter(tag='bug'):
        #for item in document.iterfind('bug'):
        #summary.append(item.findtext('buginformation/summary'))
        #description.append(item.findtext('buginformation/description'))
        for file in bug.iterfind('./fixedFiles/file'):
            # each file becomes its own one-item list, so the link to its bug is lost
            fixedfile.append([file.text])
    fixedfile

    #df = pd.DataFrame({'summary':summary, 'description':description, 'fixed_files':fixedfile})
    df = pd.DataFrame({'fixed_files': fixedfile})
    df

2 answers

User
#1 · edited 2024-05-19 21:38:04

The code below collects the data. The idea is to find all bug elements and iterate over them. For each bug, look up the required child elements.

    import xml.etree.ElementTree as ET
    import pandas as pd

    xml = '''<?xml version="1.0"?>
    <bugrepository name="AspectJ">
      <bug id="28974" opendate="2003-1-3 10:28:00" fixdate="2003-1-14 14:30:00">
        <buginformation>
          <summary>"Compiler error when introducing a ""final"" field"</summary>
          <description>The aspecs the problem...</description>
        </buginformation>
        <fixedFiles>
          <file>org.aspectj/modules/weaver/src/org/aspectj/weaver/AjcMemberMaker.java</file>
        </fixedFiles>
      </bug>
      <bug id="28919" opendate="2002-12-30 16:40:00" fixdate="2003-1-14 15:06:00">
        <buginformation>
          <summary>waever tries to weave into native methods ...</summary>
          <description>If youat org.aspectj.ajdt.internal.core.burce</description>
        </buginformation>
        <fixedFiles>
          <file>org.aspectj/modules/weaver/src/org/aspectj/weaver/bcel/LazyMethodGen.java</file>
        </fixedFiles>
      </bug>
      <bug id="29186" opendate="2003-1-8 21:22:00" fixdate="2003-1-14 16:43:00">
        <buginformation>
          <summary>ajc -emacssym chokes on pointcut that includes an intertype method</summary>
          <description>This ;void Foo.ajc$before$Foo</description>
        </buginformation>
        <fixedFiles>
          <file>org.aspectj/modules/weaver/src/org/aspectj/weaver/Lint.java</file>
          <file>org.aspectj/modules/weaver/src/org/aspectj/weaver/Shadow.java</file>
          <file>org.aspectj/modules/weaver/src/org/aspectj/weaver/bcel/BcelWeaver.java</file>
        </fixedFiles>
      </bug>
      <bug id="29769" opendate="2003-1-19 11:42:00" fixdate="2003-1-24 21:17:00">
        <buginformation>
          <summary>Ajde does not support new AspectJ 1.1 compiler options</summary>
          <description>The org.aspectj.ajpiler.
    This enhancement is needed byort.</description>
        </buginformation>
        <fixedFiles>
          <file>org.aspectj/modules/ajde/testdata/examples/figures-coverage/figures/Figure.java</file>
          <file>org.aspectj/modules/ajde/testsrc/org/aspectj/ajde/AjdeTests.java</file>
          <file>org.aspectj/modules/ajde/testsrc/org/aspectj/ajde/ui/StructureViewManagerTest.java</file>
          <file>org.aspectj/modules/org.aspectj.ajdt.core/src/org/aspectj/ajdt/ajc/BuildArgParser.java</file>
          <file>org.aspectj/modules/org.aspectj.ajdt.core/src/org/aspectj/ajdt/internal/core/builder/AjBuildConfig.java</file>
          <file>org.aspectj/modules/org.aspectj.ajdt.core/testsrc/org/aspectj/ajdt/ajc/BuildArgParserTestCase.java</file>
        </fixedFiles>
      </bug>
      <bug id="29959" opendate="2003-1-22 7:10:00" fixdate="2003-2-13 16:00:00">
        <buginformation>
          <summary>super call in intertype method declaration body causes VerifyError</summary>
          <description>AspectJ Compiler 1.1 showstopper</description>
        </buginformation>
        <fixedFiles>
          <file>org.aspectj/modules/org.aspectj.ajdt.core/src/org/compiler/ast/InterTypeConstructorDeclaration.java</file>
          <file>org.aspectj/modules/org.aspectj.ajdt.core/src/org/aspectj/ajdt/internal/compiler/ast/SuperFixerVisitor.java</file>
          <file>org.aspectj/modules/org.aspectj.ajdt.core/src/org/aspectj/ajdt/internal/compiler/lookup/InterTypeMethodBinding.java</file>
          <file>org.aspectj/modules/tests/bugs/SuperToIntro.java</file>
        </fixedFiles>
      </bug>
    </bugrepository>'''

    data = []
    root = ET.fromstring(xml)
    for bug in root.findall('.//bug'):
        bug_info = bug.find('buginformation')
        fixed_files = bug.find('fixedFiles')
        entry = {
            'summary': bug_info.find('summary').text,
            # text of the <description> element
            'description': bug_info.find('description').text,
            # text of every <file> child of <fixedFiles>
            'fixedFiles': [x.text for x in list(fixed_files)],
        }
        data.append(entry)

    # one dict per bug
    for entry in data:
        print(entry)

    df = pd.DataFrame(data)

Output

    {'summary': '"Compiler error when introducing a ""final"" field"', 'description': 'The aspecs the problem...', 'fixedFiles': ['org.aspectj/modules/weaver/src/org/aspectj/weaver/AjcMemberMaker.java']}
    {'summary': 'waever tries to weave into native methods ...', 'description': 'If youat org.aspectj.ajdt.internal.core.burce', 'fixedFiles': ['org.aspectj/modules/weaver/src/org/aspectj/weaver/bcel/LazyMethodGen.java']}
    {'summary': 'ajc -emacssym chokes on pointcut that includes an intertype method', 'description': 'This ;void Foo.ajc$before$Foo', 'fixedFiles': ['org.aspectj/modules/weaver/src/org/aspectj/weaver/Lint.java', 'org.aspectj/modules/weaver/src/org/aspectj/weaver/Shadow.java', 'org.aspectj/modules/weaver/src/org/aspectj/weaver/bcel/BcelWeaver.java']}
    {'summary': 'Ajde does not support new AspectJ 1.1 compiler options', 'description': 'The org.aspectj.ajpiler.\nThis enhancement is needed byort.', 'fixedFiles': ['org.aspectj/modules/ajde/testdata/examples/figures-coverage/figures/Figure.java', 'org.aspectj/modules/ajde/testsrc/org/aspectj/ajde/AjdeTests.java', 'org.aspectj/modules/ajde/testsrc/org/aspectj/ajde/ui/StructureViewManagerTest.java', 'org.aspectj/modules/org.aspectj.ajdt.core/src/org/aspectj/ajdt/ajc/BuildArgParser.java', 'org.aspectj/modules/org.aspectj.ajdt.core/src/org/aspectj/ajdt/internal/core/builder/AjBuildConfig.java', 'org.aspectj/modules/org.aspectj.ajdt.core/testsrc/org/aspectj/ajdt/ajc/BuildArgParserTestCase.java']}
    {'summary': 'super call in intertype method declaration body causes VerifyError', 'description': 'AspectJ Compiler 1.1 showstopper', 'fixedFiles': ['org.aspectj/modules/org.aspectj.ajdt.core/src/org/compiler/ast/InterTypeConstructorDeclaration.java', 'org.aspectj/modules/org.aspectj.ajdt.core/src/org/aspectj/ajdt/internal/compiler/ast/SuperFixerVisitor.java', 'org.aspectj/modules/org.aspectj.ajdt.core/src/org/aspectj/ajdt/internal/compiler/lookup/InterTypeMethodBinding.java', 'org.aspectj/modules/tests/bugs/SuperToIntro.java']}
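One caveat with the loop above: `find` returns `None` when a child element is missing, so `.text` would raise an `AttributeError` on an incomplete bug entry. A minimal sketch of a more tolerant variant follows, assuming a made-up miniature input in which the second bug has no children at all. The `SAMPLE` data and the `bug_records` helper are hypothetical names used only for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical miniature input: the second <bug> has no child elements.
SAMPLE = """<bugrepository name="demo">
  <bug id="1">
    <buginformation>
      <summary>first summary</summary>
      <description>first description</description>
    </buginformation>
    <fixedFiles>
      <file>a/A.java</file>
      <file>a/B.java</file>
    </fixedFiles>
  </bug>
  <bug id="2"/>
</bugrepository>"""


def bug_records(root):
    """Return one dict per <bug>, tolerating missing child elements."""
    records = []
    for bug in root.iter('bug'):
        records.append({
            'id': bug.get('id'),
            # findtext returns the default instead of raising when the path is absent
            'summary': bug.findtext('buginformation/summary', default=''),
            'description': bug.findtext('buginformation/description', default=''),
            # findall returns an empty list when nothing matches
            'fixedFiles': [f.text for f in bug.findall('fixedFiles/file')],
        })
    return records


records = bug_records(ET.fromstring(SAMPLE))
for record in records:
    print(record)
```

The resulting list of dicts can be passed to `pd.DataFrame(records)` in the same way as `data` above.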

User
#2 · edited 2024-05-19 21:38:04

To keep the files in a list that stays associated with the description and summary, collect them into a new list for each bug.
Try this:
    import pandas as pd
    from xml.etree.ElementTree import parse

    document = parse('dataset.xml')
    summary = []
    description = []
    fixedfile = []
    for item in document.iterfind('bug'):
        summary.append(item.findtext('buginformation/summary'))
        description.append(item.findtext('buginformation/description'))
        fixedfile.append([elt.text for elt in item.findall('fixedFiles/file')])

    df = pd.DataFrame({'summary': summary, 'description': description, 'fixed_files': fixedfile})
    df
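Why the list comprehension matters can be seen on a tiny example. This is only a sketch on a made-up `<bug>` element (the two file names are placeholders), comparing the three ways of reading the `<file>` children:

```python
import xml.etree.ElementTree as ET

# Made-up element with two <file> children, for illustration only.
bug = ET.fromstring(
    "<bug><fixedFiles><file>a/A.java</file><file>a/B.java</file></fixedFiles></bug>"
)

elements = bug.findall('fixedFiles/file')     # list of Element objects
first_only = bug.findtext('fixedFiles/file')  # text of the first match only
all_texts = [elt.text for elt in elements]    # text of every match

print(elements)
print(first_only)  # a/A.java
print(all_texts)   # ['a/A.java', 'a/B.java']
```

Only the last form gives one list of file names per bug, which is what the `fixed_files` column needs.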
For the second part, this keeps only the bugs with two or more files:

    newdf = df[df.fixed_files.str.len() >= 2]

If you want only the bugs with exactly 2 or 3 files:

    newdf = df[(df.fixed_files.str.len() == 2) | (df.fixed_files.str.len() == 3)]
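The two filters can be tried on a small stand-in frame. This is a sketch with made-up rows (the summaries and file names are placeholders, not the real dataset); it also uses `isin` as an assumed shorter alternative to chaining `==` comparisons with `|`:

```python
import pandas as pd

# Made-up rows: one list of file names per bug.
df = pd.DataFrame({
    'summary': ['one file', 'two files', 'three files', 'four files'],
    'fixed_files': [['a'], ['a', 'b'], ['a', 'b', 'c'], ['a', 'b', 'c', 'd']],
})

# .str.len() also works on list values and gives the number of files per row
n_files = df['fixed_files'].str.len()

two_or_more = df[n_files >= 2]
two_or_three = df[n_files.isin([2, 3])]

print(two_or_more['summary'].tolist())   # ['two files', 'three files', 'four files']
print(two_or_three['summary'].tolist())  # ['two files', 'three files']
```

Computing `n_files` once avoids repeating the `.str.len()` call for every condition.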
