Text containing Unicode escape codes is not displayed correctly in Python 3.7

2 answers

User

#1 · edited 2024-10-01 04:52:54

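We can do this in two steps:

First, we read the file with encoding='unicode_escape' so that every literal \uxxxx sequence is converted into the character it names.

Then, we recover the UTF-8 text: we encode the string into a bytes object with the latin-1 codec (which maps each code point below 256 to the byte of the same value, so nothing is altered) and decode those bytes back into text as UTF-8.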

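# 'unicode_escape' turns the literal \uXXXX sequences in the file into characters
with open('text.txt', encoding='unicode_escape') as f:
    text = f.read()
    print(text)
    # Edward escribiÃ³ la biografÃa de su autor favorito  (still garbled)

    # Now we convert it to utf-8: latin-1 restores the raw bytes, utf-8 decodes them
    text = text.encode('latin1').decode('utf8')
    print(text)
    # Edward escribió la biografía de su autor favorito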
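
The same two steps can be tried without a file. The following is a minimal sketch, assuming the input is an ASCII-only string that holds literal \uXXXX escapes; the helper name fix_escaped_mojibake is hypothetical and only used for illustration.

```python
def fix_escaped_mojibake(raw: str) -> str:
    """Hypothetical helper: undo \\uXXXX escapes, then undo the mis-decoding."""
    # Step 1: interpret the literal \uXXXX sequences.
    # Assumption: raw contains only ASCII characters.
    unescaped = raw.encode('ascii').decode('unicode_escape')
    # Step 2: latin-1 maps code points 0-255 to identical byte values,
    # so this recovers the original bytes, which are then decoded as UTF-8.
    return unescaped.encode('latin-1').decode('utf-8')


# Raw string, so the backslashes stay literal as they would in the file.
raw = r"Edward escribi\u00c3\u00b3 la biograf\u00c3\u00ada de su autor favorito"
fixed = fix_escaped_mojibake(raw)
print(fixed)
```

If the input is not actually double-encoded, step 2 will usually raise UnicodeDecodeError or UnicodeEncodeError instead of silently returning wrong text, so it is worth wrapping the call in a try/except when the source of the data is uncertain.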

User

#2 · edited 2024-10-01 04:52:54

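I believe the input is the result of an encoding error: UTF-8 bytes were decoded with a single-byte codec, and each byte was then written out as its own \u00XX escape.

C3 B3 are the UTF-8 encoded bytes of ó (LATIN SMALL LETTER O WITH ACUTE), which is why the text contains \u00c3\u00b3 where ó should be.

Run this in a Python 3 console: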

>>> text = "Edward escribi\u00c3\u00b3 la biograf\u00c3\u00ada de su autor favorito"
>>> text.encode('cp1252').decode('utf8')
'Edward escribió la biografía de su autor favorito'
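
To see how such input can come about, here is a small sketch that reproduces the error on a single character and then reverses it. It assumes the original data was UTF-8 that got decoded as cp1252; the actual cause in the asker's pipeline is not known.

```python
original = "ó"

# UTF-8 encodes ó as two bytes.
utf8_bytes = original.encode("utf-8")
print(" ".join(f"{b:02x}" for b in utf8_bytes))  # c3 b3

# Decoding those bytes with a single-byte codec yields two characters.
garbled = utf8_bytes.decode("cp1252")
print(garbled)  # Ã³

# Reversing the wrong decode recovers the original character.
repaired = garbled.encode("cp1252").decode("utf-8")
print(repaired)  # ó
```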
