Encoding problem when parsing XML to CSV in Python

Published 2024-10-04 01:31:26


I have a large number of XML files in a folder, and I am parsing them into a CSV file. My code looks like this:


    import xml.etree.ElementTree as ET
    import csv
    import os




    fields = [
        ('ID', 'FHRSID'),
        ('businessname', 'BusinessName'),
        ('businesstype', 'BusinessType'),
        ('address1', 'AddressLine1'),
        ('address2', 'AddressLine2'),
        ('address3', 'AddressLine3'),
        ('address4', 'AddressLine4'),
        ('postcode', 'PostCode'),
        ('longitude', 'Geocode/Longitude'),
        ('latitude', 'Geocode/Latitude')]

    path = '/***/****/****/XML'
    for filename in os.listdir(path):
        if not filename.endswith('.xml'): continue
        fullname = os.path.join(path, filename)
        tree = ET.parse(fullname)


    with open(r'outputdata.csv', 'wb') as f_businesslist:
        csv_businessdata = csv.DictWriter(f_businesslist, fieldnames=[field for field, match in fields])
        csv_businessdata.writeheader()

        for node in tree.iter('EstablishmentDetail'):
            row = {}

            for field_name, match in fields:
                try:
                    row[field_name] = node.find(match).text
                except AttributeError as e:
                    row[field_name] = ''



            csv_businessdata.writerow(row)

It does what it is supposed to do, but I get an encoding error like this:

Traceback (most recent call last):
  File "./XMLtoCsv.py", line 42, in <module>
    csv_businessdata.writerow(row)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/csv.py", line 152, in writerow
    return self.writer.writerow(self._dict_to_list(rowdict))
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe9' in position 11: ordinal not in range(128)

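Can anyone help? I have spent a lot of time reading similar questions, but nothing seems to help. I am very new to this, so I assume it is something silly I have done or failed to do. Many thanks.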
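
One possible direction, offered as a sketch rather than a definitive answer: the traceback shows Python 2.7, where the `csv` module writes byte strings and a unicode value such as `u'\xe9'` (é) gets implicitly encoded as ASCII, which fails. The version below assumes Python 3, where opening the output in text mode with an explicit `encoding='utf-8'` avoids the problem. The names `xml_folder_to_csv` and `FIELDS` are made up for illustration. It also assumes that rows from every XML file should end up in the CSV; in the code as posted, the `with` block sits after the `for` loop, so only the last parsed tree would be written.

```python
import csv
import os
import xml.etree.ElementTree as ET

# (CSV column name, XML element path) pairs, copied from the question.
FIELDS = [
    ('ID', 'FHRSID'),
    ('businessname', 'BusinessName'),
    ('businesstype', 'BusinessType'),
    ('address1', 'AddressLine1'),
    ('address2', 'AddressLine2'),
    ('address3', 'AddressLine3'),
    ('address4', 'AddressLine4'),
    ('postcode', 'PostCode'),
    ('longitude', 'Geocode/Longitude'),
    ('latitude', 'Geocode/Latitude'),
]


def xml_folder_to_csv(xml_dir, csv_path, fields=FIELDS):
    """Hypothetical helper: one CSV row per EstablishmentDetail in xml_dir."""
    # Python 3: text mode plus an explicit encoding lets non-ASCII text
    # be written. newline='' is what the csv documentation recommends.
    # On Python 2.7 one would instead keep 'wb' and call .encode('utf-8')
    # on each value before passing the row to writerow().
    with open(csv_path, 'w', newline='', encoding='utf-8') as f:
        writer = csv.DictWriter(f, fieldnames=[name for name, _ in fields])
        writer.writeheader()
        # The file loop lives inside the with block, so every file is written.
        for filename in sorted(os.listdir(xml_dir)):
            if not filename.endswith('.xml'):
                continue
            tree = ET.parse(os.path.join(xml_dir, filename))
            for node in tree.iter('EstablishmentDetail'):
                # findtext returns the default when the element is missing.
                row = {name: node.findtext(match, default='')
                       for name, match in fields}
                writer.writerow(row)
```

It would be called with the same folder and output name as in the question, for example `xml_folder_to_csv('/***/****/****/XML', 'outputdata.csv')`. Using `findtext` with a default replaces the `try`/`except AttributeError` around `node.find(match).text`.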

