How do I keep Unicode character codes in my CSV file?

with open('user-messages.csv', 'wb') as myfile:
    wr = csv.writer(myfile, dialect='excel', encoding='utf-8', quoting=csv.QUOTE_ALL)
    for _msg in userMessages:
        wr.writerow([_msg])

1 Answer

User

#1 · Posted 2024-09-30 20:34:22

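I would strongly encourage replacing these Unicode characters with their meanings now, rather than keeping the Unicode codes as strings (which can be done by adding the escape character \ and converting back later).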
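For completeness, the escape-based alternative mentioned above could look roughly like the sketch below. It assumes Python 3 and relies on the standard `unicode_escape` codec; the helper names `escape_unicode` and `unescape_unicode` are made up for illustration and are not part of the question.

```python
def escape_unicode(text):
    # Non-ASCII characters become backslash escapes (e.g. \U0001f604).
    # Note: literal backslashes and newlines in the text are escaped as well.
    return text.encode('unicode_escape').decode('ascii')


def unescape_unicode(escaped):
    # Reverse step: turn the backslash escapes back into real characters.
    return escaped.encode('ascii').decode('unicode_escape')


message = "I'm happy \U0001f604"
escaped = escape_unicode(message)
restored = unescape_unicode(escaped)

print(escaped)              # I'm happy \U0001f604  (literal backslash sequence)
print(restored == message)  # True
```

The escaped string is plain ASCII, so it can be written to the CSV as-is and converted back with `unescape_unicode` when the file is read.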

With the unicodedata.name() function, you can easily replace a Unicode character with its meaning, like this:

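import unicodedata

def normalize_unicode(text):
    """Replace each standalone non-ASCII character with its lowercase Unicode name."""
    output = []
    for word in text.split(' '):
        # unicodedata.name() accepts exactly one character, so only look up
        # single non-ASCII characters; this also keeps words such as "a" or "I".
        if len(word) == 1 and ord(word) > 127:
            try:
                output.append(unicodedata.name(word).lower())
                continue
            except ValueError:  # the character has no Unicode name
                pass
        output.append(word)
    return " ".join(output)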

Let's test this function:

>>> x = "I'm happy \U0001f604"
>>> normalize_unicode(x)
"I'm happy smiling face with open mouth and smiling eyes"

Now, let's see how to use this function in your code:

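import csv

# userMessages is the list of message strings from the question.
# Python 3: the encoding is set in open(); csv.writer() has no encoding argument.
with open('user-messages.csv', 'w', newline='', encoding='utf-8') as myfile:
    wr = csv.writer(myfile, dialect='excel', quoting=csv.QUOTE_ALL)
    for _msg in userMessages:
        wr.writerow([normalize_unicode(_msg)])  # <-- normalize each message here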
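If the goal is instead to keep the characters themselves in the file, a round trip like the one below may help confirm that nothing is lost. This is only a sketch under some assumptions: Python 3, a temporary directory instead of the real output location, and a hypothetical `user_messages` list standing in for the question's `userMessages`.

```python
import csv
import os
import tempfile

# Hypothetical sample data standing in for the question's userMessages.
user_messages = ["I'm happy \U0001f604", "plain text"]

# Write into a temporary directory so the sketch does not touch real files.
path = os.path.join(tempfile.mkdtemp(), "user-messages.csv")

# newline='' lets the csv module control line endings itself.
with open(path, "w", newline="", encoding="utf-8") as f:
    writer = csv.writer(f, dialect="excel", quoting=csv.QUOTE_ALL)
    for msg in user_messages:
        writer.writerow([msg])

# Read the file back with the same encoding and compare.
with open(path, newline="", encoding="utf-8") as f:
    read_back = [row[0] for row in csv.reader(f)]

print(read_back == user_messages)  # True
```

If the file is meant to be opened in Excel, `encoding="utf-8-sig"` is sometimes used instead so that a byte order mark is written; whether that is needed depends on the Excel version.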
