How do I convert a unicode string containing cp1252 characters to UTF-8 in Python?

1 answer

User

#1 · Posted 2024-10-01 13:31:12

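It looks like your string was decoded with latin1 (it is of type unicode, and the cp1252 byte 0x92 survives in it as the code point \x92).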

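1. To get back the original bytes, encode the string with that same encoding (latin1).
2. To get the correct text (unicode), decode those bytes with the right codec (cp1252).
3. Finally, if you want UTF-8 bytes, encode the text with the UTF-8 codec.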

Code:

>>> title = u'There\x92s thirty days in June'
>>> title.encode('latin1')
'There\x92s thirty days in June'
>>> title.encode('latin1').decode('cp1252')
u'There\u2019s thirty days in June'
>>> print(title.encode('latin1').decode('cp1252'))
There’s thirty days in June
>>> title.encode('latin1').decode('cp1252').encode('UTF-8')
'There\xe2\x80\x99s thirty days in June'
>>> print(title.encode('latin1').decode('cp1252').encode('UTF-8'))
There’s thirty days in June

Depending on whether your API accepts text (unicode) or bytes, step 3 (the final encode to UTF-8) may not be necessary.
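The session above is Python 2. For reference, here is a minimal sketch of the same round trip in Python 3, where `str` is already unicode text and `bytes` is a separate type. The helper name `fix_cp1252_mojibake` is hypothetical and only for illustration. It assumes every character in the input is below U+0100 (otherwise the latin1 encode raises `UnicodeEncodeError`) and that the recovered bytes are all defined in Python's cp1252 codec (a few, such as 0x81, are not and raise `UnicodeDecodeError`).

```python
def fix_cp1252_mojibake(text: str) -> str:
    """Undo a latin1 mis-decode of bytes that were really cp1252.

    Hypothetical helper for illustration; not part of any library.
    """
    raw = text.encode('latin1')   # step 1: recover the original bytes
    return raw.decode('cp1252')   # step 2: decode with the correct codec


title = 'There\x92s thirty days in June'
fixed = fix_cp1252_mojibake(title)

# step 3 (optional): only needed if the consumer wants bytes, not text
utf8_bytes = fixed.encode('utf-8')

print(fixed)
print(utf8_bytes)
```

If the API you are calling accepts text, passing `fixed` directly should be enough and `utf8_bytes` can be skipped.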
