Why do I get SyntaxError: (unicode error) 'utf-8' codec can't decode byte 0x96 in position 0: invalid start byte

Posted 2024-09-28 21:22:23


I got some JSON data from an API. I ran json.loads on it and printed the result to the REPL, as shown below.

  {'warnings': {'query': {'*': "Formatting of continuation data will be changing soon. To continue using the current formatting, use the 'rawcontinue' parameter. To begin using the new format, pass an empty string for 'continue' in the initial query."}}, 'query-continue': {'links': {'plcontinue': '25618423|10|R_from_other_capitalisation', 'gplcontinue': "15095968|0|1991_US_Open_-_Women's_Doubles"}}, 'query': {'pages': {'32203010': {'pageid': 32203010, 'title': "1988 Australian Open - Women's Doubles", 'ns': 0}, '25618558': {'pageid': 25618558, 'title': "1984 Wimbledon Championships - Women's Singles", 'ns': 0}, '29486043': {'pageid': 29486043, 'title': "1984 Wimbledon Championships - Women's Doubles", 'ns': 0}, '25618819': {'pageid': 25618819, 'title': "1986 US Open - Women's Singles", 'ns': 0}, '25619314': {'pageid': 25619314, 'title': "1989 US Open - Women's Singles", 'ns': 0}, '25618668': {'pageid': 25618668, 'title': "1985 US Open - Women's Singles", 'ns': 0}, '25618857': {'pageid': 25618857, 'title': "1987 Australian Open - Women's Singles", 'ns': 0}, '25618423': {'links': [{'title': "1983 Wimbledon Championships – Women's Singles", 'ns': 0}, {'title': 'Wikipedia:Mainspace', 'ns': 4}, {'title': 'Template:R from long name', 'ns': 10}], 'pageid': 25618423, 'title': "1983 Wimbledon Championships - Women's Singles", 'ns': 0}, '23826062': {'links': [{'title': "1984 French Open – Women's Singles", 'ns': 0}, {'title': 'Wikipedia:Mainspace', 'ns': 4}, {'title': 'Template:R from long name', 'ns': 10}, {'title': 'Template:R from other capitalisation', 'ns': 10}, {'title': 'Template:R from plural', 'ns': 10}, {'title': 'Template:R from short name', 'ns': 10}, {'title': 'Category:Redirects from modifications', 'ns': 14}], 'pageid': 23826062, 'title': "1984 French Open - Women's Singles", 'ns': 0}, '25619177': {'pageid': 25619177, 'title': "1989 Australian Open - Women's Singles", 'ns': 0}}}}

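I then copied the data from the REPL into a .py module and assigned it to a variable so that I could run some unit tests against it. But I keep getting this error: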

SyntaxError: (unicode error) 'utf-8' codec can't decode byte 0x96 in position 0: invalid start byte

What is going on?

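Update: exactly how I get the error. Using Visual Studio, I ran a script that fetches the data with requests and reads the content through .text. I then applied json.loads and printed the result to the Visual Studio Python 3.4 Interactive window (i.e. the REPL). Then I used the mouse to copy from that REPL and paste into a .py file in Visual Studio.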

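Update 2: When I fetch the data I use requests and then the text attribute. If I print that without json.loads, it is fine. However, if I copy this "more raw" output from the REPL and paste it, it is no longer a string but an object, and json.loads will not work on it. Does Python 3 print an object even when it should be JSON?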

Here is the raw output, without json.loads, as loaded from the API through the .text attribute of requests:

{"warnings":{"query":{"*":"Formatting of continuation data will be changing soon. To continue using the current formatting, use the 'rawcontinue' parameter. To begin using the new format, pass an empty string for 'continue' in the initial query."}},"query-continue":{"links":{"plcontinue":"25618423|10|R_from_other_capitalisation","gplcontinue":"15095968|0|1991_US_Open_-_Women's_Doubles"}},"query":{"pages":{"25618423":{"pageid":25618423,"ns":0,"title":"1983 Wimbledon Championships - Women's Singles","links":[{"ns":0,"title":"1983 Wimbledon Championships \u2013 Women's Singles"},{"ns":4,"title":"Wikipedia:Mainspace"},{"ns":10,"title":"Template:R from long name"}]},"23826062":{"pageid":23826062,"ns":0,"title":"1984 French Open - Women's Singles","links":[{"ns":0,"title":"1984 French Open \u2013 Women's Singles"},{"ns":4,"title":"Wikipedia:Mainspace"},{"ns":10,"title":"Template:R from long name"},{"ns":10,"title":"Template:R from other capitalisation"},{"ns":10,"title":"Template:R from plural"},{"ns":10,"title":"Template:R from short name"},{"ns":14,"title":"Category:Redirects from modifications"}]},"29486043":{"pageid":29486043,"ns":0,"title":"1984 Wimbledon Championships - Women's Doubles"},"25618558":{"pageid":25618558,"ns":0,"title":"1984 Wimbledon Championships - Women's Singles"},"25618668":{"pageid":25618668,"ns":0,"title":"1985 US Open - Women's Singles"},"25618819":{"pageid":25618819,"ns":0,"title":"1986 US Open - Women's Singles"},"25618857":{"pageid":25618857,"ns":0,"title":"1987 Australian Open - Women's Singles"},"32203010":{"pageid":32203010,"ns":0,"title":"1988 Australian Open - Women's Doubles"},"25619177":{"pageid":25619177,"ns":0,"title":"1989 Australian Open - Women's Singles"},"25619314":{"pageid":25619314,"ns":0,"title":"1989 US Open - Women's Singles"}}}}

1 answer
User
#1 · Posted 2024-09-28 21:22:23

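The text contains EN DASH (U+2013) characters. In the Windows-1252 codec they map to the byte \x96. You have an encoding problem, but the exact cause depends on the steps taken to copy the text into the .py file. I cut and pasted the text from the question into Notepad++ with the encoding set to ANSI, assigned it to a variable, and got: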

  File "C:\temp.py", line 1
SyntaxError: unknown decode error
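
The byte mapping described above can be checked directly in Python 3. This is a minimal sketch that only uses the built-in str.encode and bytes.decode methods; the variable names are illustrative and not taken from the question.

```python
# The EN DASH (U+2013) that appears in the link titles of the API response.
en_dash = "\u2013"

# Windows-1252 stores it as the single byte 0x96.
cp1252_bytes = en_dash.encode("windows-1252")
print(cp1252_bytes)

# UTF-8 needs three bytes for the same character.
utf8_bytes = en_dash.encode("utf-8")
print(utf8_bytes)

# Interpreting the Windows-1252 byte as UTF-8 fails, because 0x96 can only be
# a continuation byte in UTF-8 and never the start of a character.
try:
    cp1252_bytes.decode("utf-8")
    message = ""
except UnicodeDecodeError as exc:
    message = str(exc)
print(message)
```

The decode failure carries the same "can't decode byte 0x96 ... invalid start byte" wording that the compiler wraps in the SyntaxError from the question.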

But when I select UTF-8 or UTF-8 without BOM as the encoding, it works correctly. Without a #coding: comment declaring the encoding of the source, Python 3 assumes UTF-8.
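
If the editor offers no encoding menu, the same re-save can be done from Python. The sketch below assumes the module was saved as Windows-1252; the file name data_module.py is hypothetical, and a temporary directory stands in for the real project folder.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "data_module.py"  # hypothetical module name

    # Simulate a module that the editor saved as "ANSI" (Windows-1252).
    ansi_source = 'title = "1984 French Open \u2013 Women\'s Singles"\n'
    path.write_bytes(ansi_source.encode("windows-1252"))

    # Decode with the encoding the editor actually used, then write UTF-8.
    text = path.read_text(encoding="windows-1252")
    path.write_text(text, encoding="utf-8")

    resaved = path.read_bytes()

# The re-saved bytes are valid UTF-8, so Python 3 compiles them without a
# coding declaration.
code = compile(resaved, "data_module.py", "exec")
```

Decoding with the wrong source encoding would silently produce different characters, so this only helps when the original encoding is known.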

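Note that on my US Windows system, ANSI is actually Windows-1252. Saving as ANSI and adding #coding:windows-1252 works as well. Python needs to be told the encoding of the source if it differs from the default (ascii on Python 2, utf-8 on Python 3).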
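
The effect of the declaration can be reproduced without an editor by handing raw bytes to the built-in compile function, which applies the same source-encoding rules as a file on disk. This is a sketch under that assumption; the file name temp.py and the variable names are illustrative.

```python
# Source text containing an EN DASH, encoded the way an "ANSI" editor on a
# US Windows system would save it.
source_text = 'title = "1983 Wimbledon Championships \u2013 Women\'s Singles"\n'
ansi_bytes = source_text.encode("windows-1252")

# Without a declaration Python 3 assumes UTF-8 and rejects the lone 0x96 byte.
# SyntaxError is the expected failure; UnicodeDecodeError is caught as well in
# case an interpreter version reports the problem differently.
try:
    compile(ansi_bytes, "temp.py", "exec")
    undeclared_ok = True
except (SyntaxError, UnicodeDecodeError) as exc:
    undeclared_ok = False
    print("without declaration:", exc)

# With a coding comment on the first line the same bytes compile and run.
declared_bytes = b"# coding: windows-1252\n" + ansi_bytes
namespace = {}
exec(compile(declared_bytes, "temp.py", "exec"), namespace)
title = namespace["title"]
print(ascii(title))
```

The declaration has to appear on the first or second line of the file and must match the encoding the file was really saved in.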
