Python ISO-8859-1 encoding

Posted 2024-05-19 10:29:45

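I am running into a serious encoding problem in Python when working with the ISO-8859-1/Latin-1 character set.

When I use os.listdir to get the contents of a folder, the strings I get back are encoded in ISO-8859-1 (for example, 'Ol\xe1 Mundo'), but in the Python interpreter the same string is encoded in a different character set: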

In : 'Olá Mundo'.decode('latin-1')
Out: u'Ol\xa0 Mundo'

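How can I force Python to decode the string to the same format? I can see that os.listdir returns correctly encoded strings, but the interpreter does not (the 'á' character corresponds to '\xe1' in ISO-8859-1, not '\xa0'):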

http://en.wikipedia.org/wiki/ISO/IEC_8859-1

Any ideas on how to get around this?

1 Answer

User

#1 · Posted 2024-05-19 10:29:45

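When you type a non-unicode string literal in a Python 2 interactive session, its bytes are whatever the console sent, so the literal is effectively in the console's encoding (reported by sys.stdin.encoding), not necessarily Latin-1. Decoding those bytes as 'latin-1' therefore gives the wrong character.

You appear to be on Windows, so the console encoding is probably 'cp850' or 'cp437', and in both of those code pages 'á' is the byte 0xA0: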

C:\>python
Python 2.7.2 (default, Jun 12 2011, 15:08:59) [MSC v.1500 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys
>>> sys.stdin.encoding
'cp850'
>>> 'Olá Mundo'
'Ol\xa0 Mundo'
>>> u'Olá Mundo'.encode('cp850')
'Ol\xa0 Mundo'
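
The byte values in that session can be checked without a Windows console. The following is a minimal sketch, not part of the original answer. It assumes only the standard codecs shipped with Python and uses escaped literals, so it does not depend on the terminal or source-file encoding:

```python
# Illustrative sketch: compare how a-acute (U+00E1) is encoded by each codec.
text = u'Ol\xe1 Mundo'  # the string from the question, written with an escape

for codec in ('cp437', 'cp850', 'latin-1', 'cp1252'):
    print('%-8s %r' % (codec, text.encode(codec)))

# Bytes as typed at a cp850 console: 0xA0 is a-acute in that code page.
console_bytes = b'Ol\xa0 Mundo'

# Decoding with the wrong codec maps 0xA0 to U+00A0 (no-break space) ...
wrong = console_bytes.decode('latin-1')
# ... while decoding with the console's codec recovers the intended text.
right = console_bytes.decode('cp850')

assert wrong == u'Ol\xa0 Mundo'
assert right == text
# Re-encoding as Latin-1 gives the bytes that os.listdir returned in the question.
assert right.encode('latin-1') == b'Ol\xe1 Mundo'
```

This reproduces the u'Ol\xa0 Mundo' result from the question: the bytes were valid cp850, but they were decoded as Latin-1.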

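If you change the code page to 1252 (roughly equivalent to latin1; the two differ only in the 0x80-0x9F range), the string is displayed as expected: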

C:\>chcp 1252
Active code page: 1252

C:\>python
Python 2.7.2 (default, Jun 12 2011, 15:08:59) [MSC v.1500 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys
>>> sys.stdin.encoding
'cp1252'
>>> 'Olá Mundo'
'Ol\xe1 Mundo'
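
Changing the code page is not the only way to make the two sides agree. As a hedged sketch (the helper name `to_text` is hypothetical and not from the original answer), byte strings typed at the console can be decoded with the encoding the console reports, and passing a unicode path to os.listdir makes Python 2 return unicode file names, so both can be compared as text rather than as bytes:

```python
import os
import sys


def to_text(raw, encoding=None):
    """Decode console bytes to unicode text (hypothetical helper)."""
    if not isinstance(raw, bytes):
        return raw  # already text, nothing to decode
    # Assumption: fall back to the console encoding, then to UTF-8.
    encoding = encoding or getattr(sys.stdin, 'encoding', None) or 'utf-8'
    return raw.decode(encoding)


# With a unicode path, os.listdir returns unicode names on Python 2
# (names that cannot be decoded are still returned as byte strings there);
# Python 3 always returns str for a str path.
names = os.listdir(u'.')

# Bytes as they would arrive from a cp850 console, decoded explicitly.
typed = to_text(b'Ol\xa0 Mundo', 'cp850')

# Both sides are now text, so the comparison no longer depends on code pages.
print(typed in names)
```

In an interactive session the second argument can be omitted, in which case the helper uses sys.stdin.encoding, the same value the sessions above print.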
