The csv module doesn’t directly support reading and writing Unicode, but it is 8-bit-clean save for some problems with ASCII NUL characters. So you can write functions or classes that handle the encoding and decoding for you as long as you avoid encodings like UTF-16 that use NULs. UTF-8 is recommended.
import csv

def unicode_csv_reader(unicode_csv_data, dialect=csv.excel, **kwargs):
    # csv.py doesn't do Unicode; encode temporarily as UTF-8:
    csv_reader = csv.reader(utf_8_encoder(unicode_csv_data),
                            dialect=dialect, **kwargs)
    for row in csv_reader:
        # decode UTF-8 back to Unicode, cell by cell:
        yield [unicode(cell, 'utf-8') for cell in row]

def utf_8_encoder(unicode_csv_data):
    for line in unicode_csv_data:
        yield line.encode('utf-8')
def UnicodeDictReader(utf8_data, **kwargs):
    csv_reader = csv.DictReader(utf8_data, **kwargs)
    for row in csv_reader:
        yield {unicode(key, 'utf-8'): unicode(value, 'utf-8')
               for key, value in row.iteritems()}
First, use the 2.6 version of the documentation; it can change with each release. It explicitly says it does not support Unicode, but it does support UTF-8. Technically, those are not the same thing. As the docs say:
The example above (from the docs) shows how to create two functions that correctly read UTF-8 text as CSV. You should know that csv.reader() returns a reader object whose rows are lists of byte strings, not unicode. For me, the key wasn't manipulating the csv DictReader args but the file opener itself. That did the trick:
No special classes required. Now I can open files with or without a BOM without crashing.
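For reference, that opener-based idea maps directly onto Python 3, where the csv module works with text natively and no wrapper functions are needed. A minimal sketch (the filename and column names are placeholders, not from the original answer): opening with encoding='utf-8-sig' strips a BOM if one is present and reads plain UTF-8 otherwise, so the same call handles both kinds of file.

```python
import csv

# Hypothetical sample file: UTF-8 with a BOM ('utf-8-sig' writes one).
with open('people.csv', 'w', encoding='utf-8-sig', newline='') as f:
    writer = csv.writer(f)
    writer.writerow(['name', 'city'])
    writer.writerow(['Éléonore', 'Zürich'])

# 'utf-8-sig' tolerates an optional BOM, so this open() call works
# whether or not the file starts with one; newline='' is what the
# csv docs require for file objects passed to reader/writer.
with open('people.csv', encoding='utf-8-sig', newline='') as f:
    for row in csv.DictReader(f):
        print(row['name'], row['city'])
```

The same read call on a BOM-less UTF-8 file behaves identically, which is the "with or without BOM" property described above.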
I worked out an answer myself:
Note: this has been updated so that the keys are decoded as well, per the suggestion in the comments.