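<p>Python generally stores unicode values internally as UCS2. The UTF-16 representation of the UTF-32 character \U00010302 is the surrogate pair \uD800\uDF02, which is why you got that result.</p>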
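<p>To make the arithmetic concrete, here is a minimal sketch of how a code point above U+FFFF is split into a UTF-16 surrogate pair. The helper name <code>utf16_surrogates</code> is made up for this illustration and is not part of Python.</p>

```python
def utf16_surrogates(code_point):
    """Split a supplementary-plane code point into a UTF-16 surrogate pair.

    Hypothetical helper for illustration only; not a standard library function.
    """
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("code point is outside the supplementary planes")
    offset = code_point - 0x10000      # 20-bit value to spread over two units
    high = 0xD800 + (offset >> 10)     # top 10 bits go into the high surrogate
    low = 0xDC00 + (offset & 0x3FF)    # bottom 10 bits go into the low surrogate
    return high, low


high, low = utf16_surrogates(0x10302)
print("%04X %04X" % (high, low))  # D800 DF02
```

<p>On a UCS2 build these two code units are what the string actually holds, so the single character shows up as two.</p>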
<p>That said, some Python builds use UCS4, and UCS2 and UCS4 builds are not compatible with each other.</p>
<p>Take a look <a href="http://docs.python.org/c-api/unicode.html" rel="nofollow">here</a>.</p>
<blockquote>
<p>Py_UNICODE
This type represents the storage type which is used by Python internally as basis for holding Unicode ordinals. Python’s default builds use a 16-bit type for Py_UNICODE and store Unicode values internally as UCS2. It is also possible to build a UCS4 version of Python (most recent Linux distributions come with UCS4 builds of Python). These builds then use a 32-bit type for Py_UNICODE and store Unicode data internally as UCS4. On platforms where wchar_t is available and compatible with the chosen Python Unicode build variant, Py_UNICODE is a typedef alias for wchar_t to enhance native platform compatibility. On all other platforms, Py_UNICODE is a typedef alias for either unsigned short (UCS2) or unsigned long (UCS4).</p>
</blockquote>
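<p>If you are not sure which kind of build you are running, <code>sys.maxunicode</code> gives a hint. The sketch below assumes the usual convention that 0xFFFF indicates a UCS2 ("narrow") build; the function name <code>unicode_build</code> is made up for this example. Note that Python 3.3 and later always report 0x10FFFF, because the narrow/wide distinction was removed there.</p>

```python
import sys


def unicode_build():
    """Return "narrow" for a UCS2 build, "wide" otherwise.

    Hypothetical helper for illustration; it relies only on sys.maxunicode.
    """
    # 0xFFFF    -> narrow (UCS2) build: non-BMP characters take two code units
    # 0x10FFFF  -> wide (UCS4) build, or any Python 3.3+
    if sys.maxunicode == 0xFFFF:
        return "narrow"
    return "wide"


# On a narrow build the length is 2 (a surrogate pair), on a wide build it is 1.
print(unicode_build(), len(u"\U00010302"))
```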