Decoding the following URL in Python

Posted 2024-10-02 18:20:04


I have a URL like this:

http://idebate.org/debatabase/debates/constitutional-governance/house-supports-dalai-lama%E2%80%99s-%E2%80%98third-way%E2%80%99-tibet

I then decode this URL in Python with the following line:

full_href = urllib.unquote(full_href.encode('ascii')).decode('utf-8')

However, I get the following error:

UnicodeEncodeError: 'ascii' codec can't encode character u'\u2019' in position 89: ordinal not in range(128)

It happens when I try to write the result to a file.


1 answer
User
#1 · Posted 2024-10-02 18:20:04

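As @Kevin J. Chase pointed out, you are most likely trying to write a string containing non-ASCII characters to a file that encodes with ASCII. You can either change the encoding used when writing the file, or skip the final decode so that full_href stays a byte string, as follows: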

import urllib  # Python 2

# don't decode again to utf-8; unquote() then returns a UTF-8 encoded byte string
full_href = urllib.unquote(url.encode('ascii'))
# ... then write full_href to your file stream

Or:

...
# encode your string to a compatible encoding on write, e.g. utf-8
with open('yourfilenamehere', 'w') as f:
    f.write(full_href.encode('utf-8'))
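For readers on Python 3, where `urllib.unquote` has moved to `urllib.parse.unquote`, the same idea can be sketched as below. This is only an illustration under stated assumptions: the output path `decoded_url.txt` in the temp directory is hypothetical, and the `encode("ascii")` call is there purely to reproduce the error from the question, not something you would keep.

```python
import os
import tempfile
from urllib.parse import unquote

url = ("http://idebate.org/debatabase/debates/constitutional-governance/"
       "house-supports-dalai-lama%E2%80%99s-%E2%80%98third-way%E2%80%99-tibet")

# unquote() decodes percent-escapes as UTF-8 by default and returns a str.
full_href = unquote(url)

# Reproduce the failure: the decoded curly quote (U+2019) has no ASCII form.
try:
    full_href.encode("ascii")
    error_position = None
except UnicodeEncodeError as exc:
    error_position = exc.start  # index of the first non-ASCII character

# Fix: name the encoding explicitly when opening the file for writing.
# The path below is a hypothetical placeholder.
out_path = os.path.join(tempfile.gettempdir(), "decoded_url.txt")
with open(out_path, "w", encoding="utf-8") as f:
    f.write(full_href + "\n")
```

Passing `encoding="utf-8"` to `open()` avoids depending on the platform's default encoding, which is the usual source of this kind of error.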
