UnicodeEncodeError with csv.writer

Posted 2024-09-27 02:27:04


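Please forgive my ugly newbie code; I'm still learning. I'm pulling movie data from omdbapi, but when I write it to CSV I get a UnicodeEncodeError for many of the movies, probably because some actors' names contain accented characters. I would like to 1) identify which movies cause the problem, 2) skip them, and/or 3) preferably fix the error. All I do at the moment is abandon the whole write when the error occurs. I'm looking for a simple fix, since I'm new to this.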

import csv
import os
import json
import omdb

movie_list = ['A Good Year', 'A Room with a View', 'Anchorman', 'Amélie', 'Annie Hall', 'Before Sunrise']

data_list = []

textdoc = open('textdoc.txt','w')

for w in movie_list:
    x = omdb.request(t=w, fullplot=True, tomatoes=True, r='json')
    y = x.content
    z = json.loads(y)
    data_list.append([z["Title"], z["Year"], z["Actors"], z["Awards"], z["Director"], z["Genre"], z["Metascore"], z["Plot"], z["Rated"], z["Runtime"], z["Writer"], z["imdbID"], z["imdbRating"], z["imdbVotes"], z["tomatoRating"], z["tomatoReviews"], z["tomatoFresh"], z["tomatoRotten"], z["tomatoConsensus"], z["tomatoUserMeter"], z["tomatoUserRating"], z["tomatoUserReviews"]])

try:
    with open('Films.csv', 'w') as g:
        a = csv.writer(g, delimiter=',')
        a.writerow(["Title", "Year", "Actors", "Awards", "Director", "Genre", "Metascore", "Plot", "Rated", "Runtime", "Writer", "imdbID", "imdbRating", "imdbVotes", "tomatoRating", "tomatoReviews", "tomatoFresh", "tomatoRotten", "tomatoConsensus", "tomatoUserMeter", "tomatoUserRating", "tomatoUserReviews"])
        a.writerows(data_list)
except UnicodeEncodeError:
    print("fail")

3 Answers

If you are using Python 2, csv.writer does not really support Unicode, but the csv documentation includes an example that works around this. See, for example, this answer.

If you are using Python 3, make the following changes:

y = x.content.decode('utf8')

and

with open('Films.csv', 'w', encoding='utf8', newline='') as g:

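With these changes, the text is decoded to Unicode for processing inside the Python script and encoded back to UTF-8 when it is written to the file. This is the recommended way to handle Unicode.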
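As a minimal sketch of that decode, process, encode round trip, the following uses a hard-coded byte string in place of the OMDb response. The names `raw` and `films_demo.csv` and the sample data are assumptions for illustration, not part of the original code.

```python
import csv
import json
import os
import tempfile

# Stand-in for x.content: UTF-8 encoded JSON bytes (hypothetical sample data).
raw = '{"Title": "Amélie", "Actors": "Audrey Tautou, Mathieu Kassovitz"}'.encode('utf8')

# Decode bytes to str at the input boundary, then parse the JSON text.
z = json.loads(raw.decode('utf8'))

# Encode back to UTF-8 at the output boundary; newline='' leaves line endings to csv.
path = os.path.join(tempfile.gettempdir(), 'films_demo.csv')
with open(path, 'w', encoding='utf8', newline='') as g:
    a = csv.writer(g, delimiter=',')
    a.writerow(['Title', 'Actors'])
    a.writerow([z['Title'], z['Actors']])

# Read the file back with the same encoding to check that the accent survived.
with open(path, encoding='utf8', newline='') as g:
    rows = list(csv.reader(g))
```

Inside the script everything is a plain str; bytes appear only where data enters or leaves the program.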

newline='' is the correct way to open a file for use with csv. See this answer and the csv documentation.
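To see what newline='' does in practice, here is a small sketch that writes one row and inspects the raw bytes. The file name `newline_demo.csv` is a made-up name for this illustration.

```python
import csv
import os
import tempfile

path = os.path.join(tempfile.gettempdir(), 'newline_demo.csv')

# With newline='', the file object does not translate the line terminator
# that csv.writer emits (by default '\r\n').
with open(path, 'w', encoding='utf8', newline='') as g:
    csv.writer(g).writerow(['a', 'b'])

# Inspect the bytes that actually reached the disk.
with open(path, 'rb') as g:
    data = g.read()
```

Without newline='', platforms that translate '\n' on write (Windows in particular) can end up with an extra '\r' on every row.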

You can also remove the try/except. It only suppresses the useful traceback.
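If the goal is to find out which films trigger the error rather than to hide it, one option is to test each row against the target encoding before writing. The sketch below is only an illustration: `split_by_encodability` and the sample rows are hypothetical, and 'ascii' stands in for whatever default encoding the platform happens to pick.

```python
def split_by_encodability(rows, encoding):
    """Return (ok_rows, bad_rows): bad_rows contain a cell the encoding cannot represent."""
    ok_rows, bad_rows = [], []
    for row in rows:
        try:
            for cell in row:
                # Raises UnicodeEncodeError if any character cannot be encoded.
                str(cell).encode(encoding)
        except UnicodeEncodeError:
            bad_rows.append(row)
        else:
            ok_rows.append(row)
    return ok_rows, bad_rows


# Hypothetical sample rows shaped like data_list entries (title, year).
sample = [['Amélie', '2001'], ['Annie Hall', '1977']]
ok, bad = split_by_encodability(sample, 'ascii')
for row in bad:
    print('cannot encode:', row[0])
```

Passing 'utf8' instead of 'ascii' leaves `bad` empty for this sample, which is why opening the output file as UTF-8 makes the original error disappear.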

Python 2.x: instead of with open("Films.csv", 'w') as g:, you can try using codecs to open the CSV output with UTF-8 encoding.

import codecs
with codecs.open('Films.csv', 'w', encoding='UTF-8') as g:
    # rest of code

Python 3.x: try opening g with UTF-8 encoding:

with open('Films.csv', 'w', encoding='UTF-8') as g:

Try smart_str:

from django.utils.encoding import smart_str
data_list.append(map(smart_str, [z['element1'], z['element2']]))
# csv.writer objects expose writerow/writerows (no underscore)
a.writerow(map(smart_str, ["Title", "Year", "Actors", "Awards", "Director", "Genre", "Metascore", "Plot", "Rated", "Runtime", "Writer", "imdbID", "imdbRating", "imdbVotes", "tomatoRating", "tomatoReviews", "tomatoFresh", "tomatoRotten", "tomatoConsensus", "tomatoUserMeter", "tomatoUserRating", "tomatoUserReviews"]))
a.writerows(data_list)
