The processed XML file looks like this:
<dblp>
<incollection>
<author>Philippe Balbiani</author>
<author>Valentin Goranko</author>
<author>Ruaan Kellerman</author>
<author>Dimiter Vakarelov</author>
<booktitle>Handbook of Spatial Logics</booktitle>
</incollection>
<incollection>
<author>Jochen Renz</author>
<author>Bernhard Nebel</author>
<booktitle>Handbook of AI</booktitle>
</incollection>
...
</dblp>
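The format is as shown above. I want to extract the contents of the "author" and "booktitle" tags, both of which sit inside "incollection" tags: iterate over every "incollection" tag and pair each of its "author" values with that tag's single "booktitle" value.
My code: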
from bs4 import BeautifulSoup

soup = BeautifulSoup(str(getfile()), 'lxml')
res = soup.find_all('incollection')
list = []
list1=[]
for each in res:
    for child in each.children:
        if child.name == 'author':
            list.append(child.text)
        if child.name == 'booktitle':
            list1.append(child.text)
elem_dic = tuple(zip(list, list1))
My result:
('Philippe Balbiani', 'Handbook of Spatial Logics')
('Valentin Goranko', 'Handbook of Spatial Logics')
('Ruaan Kellerman', 'Handbook of Spatial Logics')
The expected result:
('Philippe Balbiani', 'Handbook of Spatial Logics')
('Valentin Goranko', 'Handbook of Spatial Logics')
('Ruaan Kellerman', 'Handbook of Spatial Logics')
('Dimiter Vakarelov', 'Handbook of Spatial Logics')
('Jochen Renz', 'Handbook of AI')
('Bernhard Nebel', 'Handbook of AI')
How can I modify it to get the expected result?
Answer: the code was modified as given below.
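The code in the question collects all authors into one flat list and all booktitles into another, then zips them. zip pairs the i-th author with the i-th booktitle and stops at the shorter list, so the link between an author and its own "incollection" is lost. One way to keep that link is to build the tuples inside the loop over each "incollection". The following is a minimal sketch of that idea, not necessarily the original answer's code: it swaps BeautifulSoup for the standard library's xml.etree.ElementTree so that it runs without extra packages, and the names SAMPLE_XML and author_booktitle_pairs are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Sample data copied from the question (the "..." placeholder is left out).
SAMPLE_XML = """<dblp>
<incollection>
<author>Philippe Balbiani</author>
<author>Valentin Goranko</author>
<author>Ruaan Kellerman</author>
<author>Dimiter Vakarelov</author>
<booktitle>Handbook of Spatial Logics</booktitle>
</incollection>
<incollection>
<author>Jochen Renz</author>
<author>Bernhard Nebel</author>
<booktitle>Handbook of AI</booktitle>
</incollection>
</dblp>"""


def author_booktitle_pairs(xml_text):
    """Return (author, booktitle) tuples, built per <incollection> element."""
    root = ET.fromstring(xml_text)
    pairs = []
    for entry in root.iter('incollection'):
        # Look up the booktitle of this entry only; skip entries without one.
        booktitle = entry.findtext('booktitle')
        if booktitle is None:
            continue
        # Pair every author of this entry with that same booktitle.
        for author in entry.findall('author'):
            pairs.append((author.text, booktitle))
    return pairs


# With BeautifulSoup the same per-entry logic would use
# each.find('booktitle') and each.find_all('author') inside "for each in res".
for pair in author_booktitle_pairs(SAMPLE_XML):
    print(pair)
```

Because the booktitle is read once per "incollection" and reused for each of its authors, the number of authors per entry no longer matters.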