Converting string.decode('utf8') from Python 2 to Python 3

2 Answers

User

Answer 1 · edited 2024-10-04 01:36:18

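It is not possible to get back the same unicode object as in Python 2: Python 3 has no separate unicode type like the one in Python 2. It is possible, however, to obtain the value that such a unicode object would show.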
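As a rough illustration of this point, the sketch below uses the byte string from the question and shows that decoding it in Python 3 yields an ordinary str. The built-in ascii() is one way to see the escaped form that a Python 2 repr would have displayed (minus the u prefix); using ascii() here is my own suggestion, not part of the original answer.

```python
c = b'\xe5\xb8\x90\xe6\x88\xb7'
d = c.decode('utf8')

# In Python 3 every text string is a Unicode str; there is no unicode type.
print(type(d).__name__)

# Six UTF-8 bytes decode to two characters.
print(len(c), len(d))

# ascii() returns a repr in which non-ASCII characters are escaped.
print(ascii(d))
```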

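To do this, you need to do a few things:
- create a bytes object with the value '\xe5\xb8\x90\xe6\x88\xb7'
- convert this bytes object to a string
- get the Unicode escape codes from the string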

The first step is easy. To create a bytes object c with the same value as the c in the question, just do:

c = b'\xe5\xb8\x90\xe6\x88\xb7'

Then, decode the bytes object into a string:

d = c.decode('utf8')

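Finally, I wrote a function that converts a string into a form where ASCII characters are kept and every other character is replaced by its Unicode escape code: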

def get_unicode_code(text):
    """Return text with every non-ASCII character replaced by its escape code."""
    parts = []
    for char in text:
        ord_value = ord(char)
        if ord_value < 128:
            parts.append(char)  # ASCII characters are kept unchanged
        elif ord_value <= 0xFF:
            parts.append("\\x{:02x}".format(ord_value))  # two hex digits
        elif ord_value <= 0xFFFF:
            parts.append("\\u{:04x}".format(ord_value))  # four hex digits, zero-padded
        else:
            # Code points beyond the BMP need the eight-digit \U form
            parts.append("\\U{:08x}".format(ord_value))
    return "".join(parts)

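For a string such as d, get_unicode_code(d) returns the same result as d.encode('unicode-escape').decode('ascii'), although it is most likely less efficient. The built-in codec additionally escapes backslashes and ASCII control characters such as newlines, which this function leaves unchanged.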

It takes a string as its argument and returns a string in which Unicode escape codes appear in place of the characters they represent.
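To see which escape form each range of code points gets, the built-in codec can be probed directly. This is only a small sketch: the helper name escape and the sample characters are my own choices for illustration and do not come from the answer.

```python
def escape(text):
    # Hypothetical helper: escape with the built-in codec, then return a str.
    return text.encode("unicode-escape").decode("ascii")

samples = {
    "plain ASCII": "a",
    "ASCII control": "\n",
    "Latin-1 range": "é",
    "BMP": "帐",
    "beyond BMP": "😀",
}

for label, ch in samples.items():
    # Show the code point next to the escaped form the codec produces.
    print(f"{label:14} U+{ord(ch):04X} -> {escape(ch)}")
```

The output shows two, four, or eight hex digits depending on how large the code point is, and a short named escape for the newline.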

User

Answer 2 · edited 2024-10-04 01:36:18

This is called the "unicode-escape" encoding. Here is an example of how to get this behavior in Python 3:

In [11]: c = b'\xe5\xb8\x90\xe6\x88\xb7'

In [12]: d = c.decode('utf8')

In [13]: print(d)
帐户

In [14]: print(d.encode('unicode-escape').decode('ascii'))
\u5e10\u6237

If you want the result to be bytes rather than str, you can simply drop the .decode('ascii') call.
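The difference between the two result types can be sketched as follows, reusing the byte string from the session above. The variable names escaped_bytes, escaped_text and restored are assumptions introduced for this example. The last step, decoding with 'unicode-escape' to get the original text back, is an addition that the answer does not mention.

```python
d = b'\xe5\xb8\x90\xe6\x88\xb7'.decode('utf8')

# Encoding alone yields bytes that contain the escape sequences.
escaped_bytes = d.encode('unicode-escape')

# Decoding those bytes as ASCII yields the same escapes as a str.
escaped_text = escaped_bytes.decode('ascii')

print(type(escaped_bytes).__name__, escaped_bytes)
print(type(escaped_text).__name__, escaped_text)

# The escape can be reversed by decoding with the same codec.
restored = escaped_bytes.decode('unicode-escape')
print(restored == d)
```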
