Parsing XML and writing to a CSV file

<?xml version="1.0" ?>
<library owner="James Wise">
  <book>
    <title>Sandman Volume 1: Preludes and Nocturnes</title>
    <author>Neil Gaiman</author>
  </book>
  <book>
    <title>Good Omens</title>
    <author>Neil Gamain</author>
    <author>Terry Pratchett</author>
  </book>
  <book>
    <title>The Man And The Goat</title>
    <author>Bubber Elderidge</author>
  </book>
  <book>
    <title>Once Upon A Time in LA</title>
    <author>Dr Dre</author>
  </book>
  <book>
    <title>There Will Never Be Justice</title>
    <author>IR Jury</author>
  </book>
  <book>
    <title>Beginning Python</title>
    <author>Peter Norton, et al</author>
  </book>
</library>

title,author
Sandman Volume 1: Preludes and Nocturnes,Neil Gaiman
Good Omens,Neil Gamain
Good Omens,Terry Pratchett
The Man And The Goat,Bubber Elderidge
Once Upon A Time in LA,Dr Dre
There Will Never Be Justice,IR Jury
Beginning Python,"Peter Norton, et al"

title,author,author
Sandman Volume 1: Preludes and Nocturnes,Neil Gaiman,,
Good Omens,Neil Gamain,Terry Pratchett
The Man And The Goat,Bubber Elderidge,,
Once Upon A Time in LA,Dr Dre,,
There Will Never Be Justice,IR Jury,,
Beginning Python,"Peter Norton, et al",,

3 answers

User

Reply #1 · edited 2024-06-25 23:38:00

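To get both authors onto the same row, all you need is some basic loop programming. For each title, iterate over the whole list and look for any other author recorded under that same title.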
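The loop described above can be sketched roughly as follows. This is only an illustration, assuming the records are already available as (title, author) pairs like the rows of the one-author-per-row CSV; the names `rows` and `merge_authors` are hypothetical and not from the original post.

```python
# Sample (title, author) pairs, one author per row (subset of the data above).
rows = [
    ("Sandman Volume 1: Preludes and Nocturnes", "Neil Gaiman"),
    ("Good Omens", "Neil Gamain"),
    ("Good Omens", "Terry Pratchett"),
    ("Beginning Python", "Peter Norton, et al"),
]

def merge_authors(rows):
    """For each title, scan the whole list for every author listed under it."""
    merged = []
    seen = set()
    for title, _ in rows:
        if title in seen:
            continue  # this title has already been merged
        seen.add(title)
        # Full scan of the list for each new title (quadratic, but simple)
        authors = [a for t, a in rows if t == title]
        merged.append([title] + authors)
    return merged

merged = merge_authors(rows)
for record in merged:
    print(record)
```

The repeated full scan is quadratic in the number of rows, which should be fine for a small library like this one.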

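Alternatively, sort the list by title first, so that the two authors end up in adjacent records. You can sort the XML structure directly using XML library calls.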
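A minimal sketch of the sort-first idea, under the assumption that the records have already been extracted as (title, author) pairs. Note that it sorts plain tuples rather than the XML tree itself, and that `rows` and `group_sorted` are hypothetical names used only for illustration.

```python
from itertools import groupby
from operator import itemgetter

# Sample (title, author) pairs in arbitrary order.
rows = [
    ("Good Omens", "Neil Gamain"),
    ("Beginning Python", "Peter Norton, et al"),
    ("Good Omens", "Terry Pratchett"),
]

def group_sorted(rows):
    # sorted() is stable, so authors of the same title keep their original order
    ordered = sorted(rows, key=itemgetter(0))
    # groupby() only merges adjacent items, which is why the sort comes first
    return [[title] + [author for _, author in group]
            for title, group in groupby(ordered, key=itemgetter(0))]

grouped = group_sorted(rows)
for record in grouped:
    print(record)
```

One trade-off: sorting changes the order of the books, so the output no longer follows the order of the source file.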

User

Reply #2 · edited 2024-06-25 23:38:00

A good way to solve this is to use lxml:

>>> from lxml import etree
>>> with open('doc.xml') as f:
...     doc = etree.XML(f.read())
...     for e in doc.xpath('book'):
...         # print a (authors, title) tuple for each book
...         print((e.xpath('author/text()'), e.xpath('title/text()')[0]))
...
(['Neil Gaiman'], 'Sandman Volume 1: Preludes and Nocturnes')
(['Neil Gamain', 'Terry Pratchett'], 'Good Omens')
(['Bubber Elderidge'], 'The Man And The Goat')
(['Dr Dre'], 'Once Upon A Time in LA')
(['IR Jury'], 'There Will Never Be Justice')
(['Peter Norton, et al'], 'Beginning Python')

Then, to generate the CSV, you can do something like this:

  import csv

  # newline='' lets the csv module control line endings (Python 3)
  with open('output.csv', 'w', newline='') as fout:
      fieldnames = ['title', 'author1', 'author2']
      writer = csv.DictWriter(fout, fieldnames=fieldnames)
      writer.writeheader()
      for e in doc.xpath('book'):
          title, authors = e.xpath('title/text()')[0], e.xpath('author/text()')
          # Pad with empty strings so single-author books still fill both columns;
          # any author beyond the second is dropped
          author1, author2 = (list(authors) + ['', ''])[:2]
          writer.writerow({'title': title, 'author1': author1, 'author2': author2})
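If a book could have more than two authors, one possible variation is to size the header from the longest author list. The sketch below is only an illustration: it swaps lxml for the standard library's `xml.etree.ElementTree`, writes to an in-memory string instead of `output.csv`, and the function name `library_to_csv` and the `author1..authorN` header scheme are assumptions, not part of the answer above.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Shortened copy of the library document from the question.
XML = """<library owner="James Wise">
  <book>
    <title>Good Omens</title>
    <author>Neil Gamain</author>
    <author>Terry Pratchett</author>
  </book>
  <book>
    <title>Beginning Python</title>
    <author>Peter Norton, et al</author>
  </book>
</library>"""

def library_to_csv(xml_text):
    root = ET.fromstring(xml_text)
    # Collect (title, [authors]) for every <book> element
    books = [(b.findtext("title"), [a.text for a in b.findall("author")])
             for b in root.findall("book")]
    # The widest author list decides how many author columns the CSV gets
    width = max((len(authors) for _, authors in books), default=0)
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["title"] + ["author%d" % (i + 1) for i in range(width)])
    for title, authors in books:
        # Pad short author lists so every row has the same number of columns
        writer.writerow([title] + authors + [""] * (width - len(authors)))
    return out.getvalue()

csv_text = library_to_csv(XML)
print(csv_text)
```

Padding every row to the same width keeps the file rectangular, which tends to make it easier to load in spreadsheet tools.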

User

Reply #3 · edited 2024-06-25 23:38:00

Here is another possible solution:

Code:

#!/usr/bin/env python3

import csv
from xml.dom.minidom import parse

def writeToCSV(myLibrary):
    # Python 3: open in text mode with newline='' so the csv module handles line endings
    with open('output.csv', 'w', newline='') as csvfile:
        writer = csv.writer(csvfile, delimiter=',', quotechar='"', quoting=csv.QUOTE_MINIMAL)
        writer.writerow(['title', 'author', 'author'])
        books = myLibrary.getElementsByTagName("book")
        for book in books:
            titleValue = book.getElementsByTagName("title")[0].childNodes[0].data
            authors = []  # collect all the authors of this book
            for author in book.getElementsByTagName("author"):
                authors.append(author.childNodes[0].data)
            writer.writerow([titleValue] + authors)  # one CSV row per book

doc = parse('library.xml')
myLibrary = doc.getElementsByTagName("library")[0]
# Write each book's title and authors to output.csv
writeToCSV(myLibrary)


Best regards
