How do I use UTF-8 with Python logging?

import logging

def logging_test():
    handler = logging.FileHandler("/home/ted/logfile.txt", "w",
                                  encoding="UTF-8")
    formatter = logging.Formatter("%(message)s")
    handler.setFormatter(formatter)
    root_logger = logging.getLogger()
    root_logger.addHandler(handler)
    root_logger.setLevel(logging.INFO)

    # This is an o with a hat on it.
    byte_string = '\xc3\xb4'
    unicode_string = unicode("\xc3\xb4", "utf-8")

    print "printed unicode object: %s" % unicode_string

    # Explode
    root_logger.info(unicode_string)

if __name__ == "__main__":
    logging_test()

3 Answers

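User

#1 · edited 2024-05-20 16:45:56

Check that you have the latest Python 2.6; several Unicode bugs have been found and fixed since 2.6 was released. For example, on my Ubuntu Jaunty system I ran your script, copied and pasted, removing only the "/home/ted/" prefix from the log file name. The result (copied and pasted from a terminal window):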

vinay@eta-jaunty:~/projects/scratch$ python --version
Python 2.6.2
vinay@eta-jaunty:~/projects/scratch$ python utest.py 
printed unicode object: ô
vinay@eta-jaunty:~/projects/scratch$ cat logfile.txt 
ô
vinay@eta-jaunty:~/projects/scratch$

On a Windows box:

C:\temp>python --version
Python 2.6.2

C:\temp>python utest.py
printed unicode object: ô

The contents of the file:

[screenshot of the log file contents, image not available]

This might also explain why Lennart Regebro couldn't reproduce it either.
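
For readers on Python 3, where every str is already Unicode, a rough equivalent of the question's script might look like the sketch below. The temporary log path and the logger name "utf8_demo" are assumptions made for illustration; they are not part of the original script.

```python
import logging
import os
import tempfile


def logging_test(log_path):
    # Passing encoding="utf-8" makes the handler write UTF-8
    # regardless of the locale's default encoding.
    handler = logging.FileHandler(log_path, "w", encoding="utf-8")
    handler.setFormatter(logging.Formatter("%(message)s"))

    logger = logging.getLogger("utf8_demo")  # hypothetical logger name
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep the demo output out of the root logger

    # This is an o with a hat on it (U+00F4).
    logger.info("\u00f4")

    # Close the handler so the file is flushed before reading it back.
    logger.removeHandler(handler)
    handler.close()

    with open(log_path, encoding="utf-8") as f:
        return f.read()


# Hypothetical location: a fresh temporary directory.
log_path = os.path.join(tempfile.mkdtemp(), "logfile.txt")
contents = logging_test(log_path)
```

Reading the file back with an explicit encoding is a quick way to confirm what the handler actually wrote, independent of how the terminal renders it.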

User

#2 · edited 2024-05-20 16:45:56

Having code like:

raise Exception(u'щ')

caused:

  File "/usr/lib/python2.7/logging/__init__.py", line 467, in format
    s = self._fmt % record.__dict__
UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-3: ordinal not in range(128)

This happens because the format string is a byte string, while some of the format string arguments are unicode strings with non-ASCII characters:

>>> "%(message)s" % {'message': Exception(u'\u0449')}
*** UnicodeEncodeError: 'ascii' codec can't encode character u'\u0449' in position 0: ordinal not in range(128)

Making the format string unicode fixes the issue:

>>> u"%(message)s" % {'message': Exception(u'\u0449')}
u'\u0449'

So, in your logging configuration, make all format strings unicode:

'formatters': {
    'simple': {
        'format': u'%(asctime)-s %(levelname)s [%(name)s]: %(message)s',
        'datefmt': '%Y-%m-%d %H:%M:%S',
    },
 ...

And patch the default logging formatter to use a unicode format string:

logging._defaultFormatter = logging.Formatter(u"%(message)s")
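
The failure in this answer comes from Python 2 implicitly encoding a unicode value with the ASCII codec when it is formatted into a byte string. Python 3 has no such implicit step, but the same error can be reproduced by doing the encode explicitly. This is only a sketch of the underlying mechanism, written for Python 3; the variable names are made up for illustration.

```python
text = "\u0449"  # the Cyrillic letter from the example above

# Explicitly do what Python 2 did implicitly: encode with the ASCII codec.
try:
    text.encode("ascii")
    error = None
except UnicodeEncodeError as exc:
    error = str(exc)

# Encoding with UTF-8 instead succeeds and yields two bytes.
encoded = text.encode("utf-8")
```

Here `error` holds the familiar "ordinal not in range(128)" message, while the UTF-8 encode goes through without complaint.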

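User

#3 · edited 2024-05-20 16:45:56

I had a similar problem running Django in Python 3: my logger died upon encountering some umlauts (äüßß), but was otherwise fine. I looked through a lot of results and found none of them working. I tried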

import locale
if locale.getpreferredencoding().upper() != 'UTF-8':
    locale.setlocale(locale.LC_ALL, 'en_US.UTF-8')

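which I got from a comment above. It did not work. Looking at the current locale, I got some crazy ANSI thing, which turned out to basically just mean "ASCII". That sent me in the wrong direction.

Changing the logging format strings to Unicode did not help. Setting a magic encoding comment at the beginning of the script did not help. Setting the charset on the sender's message (the text came from an HTTP request) did not help.

What DID work was setting the encoding on the file handler to UTF-8 in settings.py. Because I had nothing set, the default was None, which in the end apparently means ASCII (or as I like to think of it: ASS-KEY).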

    'handlers': {
        'file': {
            'level': 'DEBUG',
            'class': 'logging.handlers.TimedRotatingFileHandler',
            'encoding': 'UTF-8', # <-- That was missing.
            ....
        },
    },
