How to get string objects instead of Unicode from JSON? (answers)

1 answer

Anonymous, 1 day ago

<h3>Solution with <code>object_hook</code></h3>

<pre><code># Python 2 only: relies on the `unicode` type and on dict.iteritems().
import json

def json_load_byteified(file_handle):
    return _byteify(
        json.load(file_handle, object_hook=_byteify),
        ignore_dicts=True
    )

def json_loads_byteified(json_text):
    return _byteify(
        json.loads(json_text, object_hook=_byteify),
        ignore_dicts=True
    )

def _byteify(data, ignore_dicts=False):
    # if this is a unicode string, return its string representation
    if isinstance(data, unicode):
        return data.encode('utf-8')
    # if this is a list of values, return list of byteified values
    if isinstance(data, list):
        return [_byteify(item, ignore_dicts=True) for item in data]
    # if this is a dictionary, return dictionary of byteified keys and values
    # but only if we haven't already byteified it
    if isinstance(data, dict) and not ignore_dicts:
        return {
            _byteify(key, ignore_dicts=True): _byteify(value, ignore_dicts=True)
            for key, value in data.iteritems()
        }
    # if it's anything else, return it in its original form
    return data
</code></pre>

Example usage:

<pre><code>>>> json_loads_byteified('{"Hello": "World"}')
{'Hello': 'World'}
>>> json_loads_byteified('"I am a top-level string"')
'I am a top-level string'
>>> json_loads_byteified('7')
7
>>> json_loads_byteified('["I am inside a list"]')
['I am inside a list']
>>> json_loads_byteified('[[[[[[[["I am inside a big nest of lists"]]]]]]]]')
[[[[[[[['I am inside a big nest of lists']]]]]]]]
>>> json_loads_byteified('{"foo": "bar", "things": [7, {"qux": "baz", "moo": {"cow": ["milk"]}}]}')
{'things': [7, {'qux': 'baz', 'moo': {'cow': ['milk']}}], 'foo': 'bar'}
>>> json_load_byteified(open('somefile.json'))
{'more json': 'from a file'}</code></pre>

<h3>How does this work and why would I use it?</h3>

<a href="https://stackoverflow.com/a/13105359/1709587">Mark Amery's function</a> is shorter and clearer than these ones, so what's the point of them? Why would you want to use them?

Purely for performance. Mark's answer first decodes the JSON text fully into unicode strings, then recurses through the entire decoded value to convert all strings to byte strings. This has a couple of undesirable effects:

<ul>
<li>A copy of the entire decoded structure gets created in memory</li>
<li>If your JSON object is really deeply nested (500 levels or more), you'll hit Python's maximum recursion depth</li>
</ul>
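This answer mitigates both of those performance issues by using the <code>object_hook</code> parameter of <code>json.load</code> and <code>json.loads</code>. From <a href="https://docs.python.org/2/library/json.html#json.load" rel="noreferrer">the docs</a>:

<blockquote> <code>object_hook</code> is an optional function that will be called with the result of any object literal decoded (a <code>dict</code>). The return value of object_hook will be used instead of the <code>dict</code>. This feature can be used to implement custom decoders </blockquote>

Since dictionaries nested many levels deep inside other dictionaries get passed to <code>object_hook</code> as they are decoded, we can byteify any strings or lists inside them at that point and avoid the need for deep recursion later.

Mark's answer isn't suitable for use as an <code>object_hook</code> as it stands, because it recurses into nested dictionaries. We prevent that recursion in this answer with the <code>ignore_dicts</code> parameter of <code>_byteify</code>, which is passed as <code>True</code> at all times except when <code>object_hook</code> hands <code>_byteify</code> a new <code>dict</code> to byteify. The <code>ignore_dicts</code> flag tells <code>_byteify</code> to ignore <code>dict</code>s, since they have already been byteified.

Finally, our implementations of <code>json_load_byteified</code> and <code>json_loads_byteified</code> call <code>_byteify</code> (with <code>ignore_dicts=True</code>) on the result returned from <code>json.load</code> or <code>json.loads</code>, to handle the case where the JSON text being decoded doesn't have a <code>dict</code> at the top level.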
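To make the decoding order described above concrete, here is a minimal sketch that only records when <code>object_hook</code> fires. The names <code>record_hook</code> and <code>calls</code> are hypothetical helpers introduced for illustration and are not part of the answer's code. The snippet does no byteifying, so it should behave the same way on Python 2 and Python 3.

```python
import json

# Keys of each decoded dict, in the order the decoder hands them to the hook.
calls = []

def record_hook(decoded_dict):
    # Hypothetical helper (not from the answer): log the dict's keys
    # and return the dict unchanged.
    calls.append(sorted(decoded_dict))
    return decoded_dict

# Nested objects: the hook sees the innermost dict first ("leaf"),
# then "middle", then "outer".
json.loads('{"outer": {"middle": {"leaf": 1}}}', object_hook=record_hook)
print(calls)

# A top-level list containing no objects never triggers the hook at all.
calls_before = len(calls)
top_level_list = json.loads('["no dict here"]', object_hook=record_hook)
print(len(calls) == calls_before)
```

Because each inner dict has already passed through the hook by the time its parent arrives, the hook never needs to descend into nested dicts, which is the job of <code>ignore_dicts=True</code>. The second call shows why the wrapper functions still run <code>_byteify</code> once on the final result: without a top-level <code>dict</code>, the hook is never invoked.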
