Python: 'ascii' codec can't encode character error while writing to CSV

import csv
import requests
from BeautifulSoup import BeautifulSoup

url = 'https://dummysite'
response = requests.get(url)
html = response.content
soup = BeautifulSoup(html)
table = soup.find('table', {'class': 'table'})

list_of_rows = []
for row in table.findAll('tr')[1:]:
    list_of_cells = []
    for cell in row.findAll('td'):
        text = cell.text.replace('[','').replace(']','')
        list_of_cells.append(text)
    list_of_rows.append(list_of_cells)

outfile = open("./test.csv", "wb")
writer = csv.writer(outfile)
writer.writerow(["Name", "Location"])
writer.writerows(list_of_rows)

2 Answers

User

#1 · Edited 2024-09-28 21:21:26

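In addition to Alastair's excellent suggestions, I found that the easiest option was to use Python 3 instead of Python 2. All my script needed was changing wb to w in the open statement, in accordance with Python 3's syntax.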
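A minimal sketch of what the Python 3 version of the writing step might look like. The helper name `write_rows` and the sample data are hypothetical, and the explicit `encoding="utf-8"` and `newline=""` arguments are assumptions that go slightly beyond the wb to w change described above:

```python
import csv

def write_rows(path, rows):
    """Write a header line plus rows of text to a UTF-8 encoded CSV file (Python 3)."""
    # Text mode ("w") instead of "wb"; newline="" lets the csv module manage line endings.
    with open(path, "w", newline="", encoding="utf-8") as outfile:
        writer = csv.writer(outfile)
        writer.writerow(["Name", "Location"])
        writer.writerows(rows)

# Hypothetical rows standing in for the scraped table cells.
sample_rows = [["José", "São Paulo"], ["Zoë", "Zürich"]]
```

Calling `write_rows("./test.csv", sample_rows)` should then write the non-ASCII names without an encoding error, since Python 3 strings are Unicode and the file object handles the encoding.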

User

#2 · Edited 2024-09-28 21:21:26

The Python 2.x csv library is broken: it does not handle Unicode input. You have three options, in order of complexity:

Edit: see below. ~~Use the fixed library https://github.com/jdunck/python-unicodecsv (pip install unicodecsv). Use it as a drop-in replacement. Example:~~
```
import unicodecsv

# Open in binary mode; unicodecsv decodes each field as UTF-8
with open("myfile.csv", 'rb') as my_file:
    r = unicodecsv.DictReader(my_file, encoding='utf-8')
```

Read the csv documentation on Unicode: https://docs.python.org/2/library/csv.html (see the examples at the bottom).

Manually encode each item as UTF-8:

for cell in row.findAll('td'):
    text = cell.text.replace('[','').replace(']','')
    list_of_cells.append(text.encode("utf-8"))
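To illustrate why the explicit encoding helps, here is a small sketch (the sample string is hypothetical). As far as I understand, the Python 2 csv writer converts unicode values to byte strings with the default ASCII codec, which fails on any non-ASCII character, whereas encoding to UTF-8 up front yields bytes it can write directly:

```python
# Hypothetical cell text containing a non-ASCII character (U+00E9).
text = u"Caf\u00e9"

# Encoding with the ASCII codec fails; this is the error from the question.
try:
    text.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

# Encoding as UTF-8 succeeds and produces a byte string.
utf8_bytes = text.encode("utf-8")
```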

Edit: I found that python-unicodecsv also breaks when reading UTF-16. It complains about any 0x00 bytes.

Instead, use https://github.com/ryanhiebert/backports.csv, which more closely resembles the Python 3 implementation and uses the io module.

Install:

pip install backports.csv

Usage:

from backports import csv
import io

# filename is the path to the CSV file to read
with io.open(filename, encoding='utf-8') as f:
    r = csv.reader(f)
