Python: unreproducible UnicodeDecodeError

ERROR:root:message
Traceback (most recent call last):
  File "FooBar.py", line 402, in foo_bar
    bar = bar_constructor(bar_theme,bar_user,uuid)
  File "FooBar.py", line 187, in bar_constructor
    if(main(uuid)):
  File "FooBar.py", line 158, in main
    f.write(make_wordm(uuid=uuid))
  File "/home/foo/FooBarGen.py", line 57, in make_wordm
    search="00000000-0000-0000-0000-000000000000", replace=uuid)
  File "/home/foo/FooBarGen.py", line 24, in zipinfo_contents_replace
    contents = fd.read().replace(search, replace)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 2722: ordinal not in range(128)
INFO:FooBar:None

3 Answers

User

Answer 1 · edited 2024-10-01 04:57:53

Change this line:

with open(fname, 'r') as fd:

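to this:

[code snippet missing from the source]

The ascii codec can only handle character codes from 0 to 127. Your file contains the byte 0xc3, which is outside that range, so you need to choose a different codec.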
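
To make the range limit concrete, here is a minimal Python 3 sketch. The sample bytes and the helper `try_decode` are hypothetical and only for illustration; the sketch assumes the file is UTF-8, which the 0xc3 byte suggests but does not prove.

```python
# Hypothetical sample bytes: "café" encoded as UTF-8 (é becomes 0xc3 0xa9).
data = b"caf\xc3\xa9"


def try_decode(raw, encoding):
    """Return the decoded text, or the UnicodeDecodeError if decoding fails."""
    try:
        return raw.decode(encoding)
    except UnicodeDecodeError as exc:
        return exc


ascii_result = try_decode(data, "ascii")  # fails: 0xc3 is above 127
utf8_result = try_decode(data, "utf-8")   # succeeds: 0xc3 0xa9 is a valid sequence

print(repr(ascii_result))
print(utf8_result)
```

The ascii attempt reports the same kind of error as the traceback in the question, while the utf-8 attempt returns text.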

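User

Answer 2 · edited 2024-10-01 04:57:53

In the past I often ran into problems with special characters. I always decode to Unicode when reading and encode to utf-8 when writing back to the file.

I hope this works for you too.

For my solution I have always used something I found in this presentation: http://farmdev.com/talks/unicode/

So I would use this: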

def to_unicode_or_bust(obj, encoding='utf-8'):
    if isinstance(obj, basestring):
        if not isinstance(obj, unicode):
            obj = unicode(obj, encoding)
    return obj

Then in your code:

[code snippet missing from the source]

Then set the encoding back to utf-8 when writing.

output_zip.writestr(entry, contents.encode('utf-8'))

I did not reproduce your problem, so this is only a suggestion. Hope it helps.
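
The decode-on-read, encode-on-write idea could look roughly like this in Python 3. This is only a sketch: `replace_in_zip` is a hypothetical helper, not the asker's `zipinfo_contents_replace`, and it assumes every archive entry is UTF-8 text.

```python
import io
import zipfile


def replace_in_zip(zip_bytes, search, replace, encoding="utf-8"):
    """Return new zip bytes with `search` replaced by `replace` in every entry.

    Assumes every entry is text in `encoding`; binary entries would need
    to be copied through unchanged instead.
    """
    source = io.BytesIO(zip_bytes)
    target = io.BytesIO()
    with zipfile.ZipFile(source, "r") as input_zip, \
            zipfile.ZipFile(target, "w") as output_zip:
        for entry in input_zip.infolist():
            # Decode once on the way in...
            contents = input_zip.read(entry).decode(encoding)
            contents = contents.replace(search, replace)
            # ...and encode once on the way out.
            output_zip.writestr(entry, contents.encode(encoding))
    return target.getvalue()


# Build a tiny in-memory archive with a placeholder and a non-ASCII character.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("doc.xml", "<id>PLACEHOLDER</id><name>Zo\u00eb</name>".encode("utf-8"))

result = replace_in_zip(buf.getvalue(), "PLACEHOLDER", "1234")
with zipfile.ZipFile(io.BytesIO(result)) as zf:
    text = zf.read("doc.xml").decode("utf-8")
print(text)
```

All string handling in the middle happens on Unicode text, so the replacement never triggers an implicit codec.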

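User

Answer 3 · edited 2024-10-01 04:57:53

The problem is a mix of Unicode and byte strings. Python 2 "helpfully" tries to convert from one to the other, but it defaults to the ascii codec.

Here is an example: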

>>> 'aeioü'.replace('a','b')  # all byte strings
'beio\xfc'
>>> 'aeioü'.replace(u'a','b') # one Unicode string and it converts...
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'ascii' codec can't decode byte 0xfc in position 4: ordinal not in range(128)

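You mentioned reading the UUID from JSON. JSON returns Unicode strings. Ideally, decode all text files to Unicode when reading, do all text processing in Unicode, and encode the text when writing it back to storage. In your "larger framework" this may be a big porting job, but in essence you use io.open with an encoding to read the file and decode it to Unicode:

[code snippet missing from the source]

Note that encoding should match the actual encoding of the file you are reading. That is something you will have to determine.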
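
As a rough illustration of why the encoding has to match, the following sketch writes a small sample file and reads it back three ways. The file name, its contents and the replacement value "1234" are made up for the example, and utf-8 is an assumption about the real file.

```python
import io
import os
import tempfile

# Hypothetical sample file: UTF-8 text containing a placeholder UUID.
placeholder = "00000000-0000-0000-0000-000000000000"
path = os.path.join(tempfile.mkdtemp(), "sample.txt")
with io.open(path, "w", encoding="utf-8") as fd:
    fd.write("id=" + placeholder + " name=M\u00fcller")

# Matching encoding: read() returns Unicode text, so replace() is safe.
with io.open(path, "r", encoding="utf-8") as fd:
    contents = fd.read().replace(placeholder, "1234")

# Mismatched strict codec: the same bytes fail to decode as ASCII.
try:
    with io.open(path, "r", encoding="ascii") as fd:
        fd.read()
    ascii_failed = False
except UnicodeDecodeError:
    ascii_failed = True

# Mismatched permissive codec: no error, but the text is silently garbled.
with io.open(path, "r", encoding="latin-1") as fd:
    garbled = fd.read()

print(contents)
print(ascii_failed)
print(garbled)
```

The latin-1 case is the more dangerous one, since nothing fails and the wrong characters only show up later.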

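As you found in your edit, a shortcut is to encode the UUID from JSON back to a byte string, but the goal should be to use Unicode for handling text.

Python 3 cleans up this process by making strings Unicode by default and removing the implicit conversions between byte and Unicode strings.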
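
For comparison, here is a small sketch of how Python 3 treats the same kind of mix. The variable names are illustrative only.

```python
text = "aeio\u00fc"            # str (Unicode) in Python 3
raw = text.encode("latin-1")   # bytes: b'aeio\xfc'

# Same-type operations work as before.
same_type = raw.replace(b"a", b"b")

# Mixing types raises TypeError instead of silently decoding as ASCII.
try:
    raw.replace("a", "b")
    mixed_error = None
except TypeError as exc:
    mixed_error = exc

print(same_type)
print(type(mixed_error).__name__)
```

The failure no longer depends on whether the data happens to contain a non-ASCII byte, so it shows up on the first run instead of only with certain inputs.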
