
Posted 2024-09-28 19:31:37




This causes a problem, because when the file is eventually written to the database, it shows up as "Côte d'Ivodia".

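Update: if I use with io.open("Test.html", "w") as f_out: in the code below instead, the file contains the correct U+00F4, which displays as "?", but the final database record still shows up as "Côte d'Ivoid" :-(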


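from __future__ import unicode_literals

import io

line = "The current population of Côte d'Ivoire is 26,051,291"
for c in line:
    if ord(c) > 127:
        print(c, c.encode('utf-8').hex())
        # "\uC3B4" is the single code point U+C3B4, not the UTF-8 bytes C3 B4,
        # so this replace finds nothing and leaves the line unchanged.
        line1 = line.replace("\uC3B4", "ô")
        line2 = line.replace(c, "\u00F4")
        line3 = line.replace(c, "ô")

# with io.open("Test.html", "w", encoding="utf-8") as f_out:
with io.open("Test.html", "w") as f_out:
    # The body of this block is missing from the post. The three identical
    # lines in the dump below suggest each variant was written out (assumption).
    f_out.write(line1 + "\n")
    f_out.write(line2 + "\n")
    f_out.write(line3 + "\n")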


00000000h: 54 68 65 20 63 75 72 72 65 6E 74 20 70 6F 70 75 ; The current popu
00000010h: 6C 61 74 69 6F 6E 20 6F 66 20 43 C3 B4 74 65 20 ; lation of Côte 
00000020h: 64 27 49 76 6F 69 72 65 20 69 73 20 32 36 2C 30 ; d'Ivoire is 26,0
00000030h: 35 31 2C 32 39 31 0D 0A 54 68 65 20 63 75 72 72 ; 51,291..The curr
00000040h: 65 6E 74 20 70 6F 70 75 6C 61 74 69 6F 6E 20 6F ; ent population o
00000050h: 66 20 43 C3 B4 74 65 20 64 27 49 76 6F 69 72 65 ; f Côte d'Ivoire
00000060h: 20 69 73 20 32 36 2C 30 35 31 2C 32 39 31 0D 0A ;  is 26,051,291..
00000070h: 54 68 65 20 63 75 72 72 65 6E 74 20 70 6F 70 75 ; The current popu
00000080h: 6C 61 74 69 6F 6E 20 6F 66 20 43 C3 B4 74 65 20 ; lation of Côte 
00000090h: 64 27 49 76 6F 69 72 65 20 69 73 20 32 36 2C 30 ; d'Ivoire is 26,0
000000a0h: 35 31 2C 32 39 31 0D 0A 54 68 65 20 63 75 72 72 ; 51,291..The curr
000000b0h: 65 6E 74 20 70 6F 70 75 6C 61 74 69 6F 6E 20 6F ; ent population o
000000c0h: 66 20 43 C3 B4 74 65 20 64 27 49 76 6F 69 72 65 ; f Côte d'Ivoire
000000d0h: 20 69 73 20 32 36 2C 30 35 31 2C 32 39 31 0D 0A ;  is 26,051,291..
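
As a quick check on the dump above, the bytes for "Côte" can be decoded by hand. This is only a sketch: cp1252 is an assumed example of a single-byte encoding that a downstream reader might use, and the post does not say which encoding the database side actually applies.

```python
# Bytes copied from the dump above: "C", 0xC3 0xB4, "t", "e".
raw = bytes.fromhex("43 C3 B4 74 65")

# Decoded as UTF-8, the pair C3 B4 forms the single character U+00F4 (ô).
as_utf8 = raw.decode("utf-8")

# Decoded as cp1252 (assumed here for illustration), each byte becomes
# its own character, so the one ô turns into two characters.
as_cp1252 = raw.decode("cp1252")

# ascii() escapes non-ASCII characters so the output is terminal-independent.
print(len(as_utf8), ascii(as_utf8))
print(len(as_cp1252), ascii(as_cp1252))
```

If the file really holds C3 B4, it is valid UTF-8, and a garbled result would come from whatever reads those bytes with a different encoding.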

Answer #1 · Posted 2024-09-28 19:31:37


>>> l = "ô"  # our text to be encoded
>>> "U+%04x" % ord(l)
'U+00f4'  # the code point (ordinal encoded in hex)
>>> l.encode("utf-8")
b'\xc3\xb4'  # the UTF-8 encoded bytes
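
Building on the session above, here is a minimal sketch of writing the text with an explicit encoding and reading it back. The temporary path is hypothetical and only used for this example, and the database step from the question is not shown because the post gives no details about it.

```python
import io
import os
import tempfile

text = "C\u00f4te d'Ivoire"  # "Côte d'Ivoire", with ô written as U+00F4

# Hypothetical temporary location, so the sketch does not touch real files.
path = os.path.join(tempfile.mkdtemp(), "Test.html")

# Pass the encoding explicitly instead of relying on the platform default.
with io.open(path, "w", encoding="utf-8") as f_out:
    f_out.write(text)

# Raw bytes on disk: the ô is stored as the two bytes C3 B4.
with io.open(path, "rb") as f_in:
    raw = f_in.read()

# Reading back with the same encoding restores the original text.
with io.open(path, "r", encoding="utf-8") as f_in:
    round_trip = f_in.read()

print(raw)
print(round_trip == text)
```

The same idea would apply to any later step that consumes the file: it has to decode the bytes as UTF-8 too, otherwise C3 B4 is read as two separate characters.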

