Parsing a Gmail message body that contains Unicode characters in Python

from email.feedparser import FeedParser

# Assumption: ContentFile is Django's django.core.files.base.ContentFile
from django.core.files.base import ContentFile

p = FeedParser()
p.feed(msg)
msg = p.close()
attachments = []
body = None
for part in msg.walk():
    # Skip container parts; only leaf parts carry a payload
    if part.get_content_type().startswith('multipart/'):
        continue
    try:
        filename = part.get_filename()
    except Exception:
        # unicode letters in filename, set default name then
        filename = 'Mail attachment'
    if part.get_content_type() == "text/plain" and not body:
        body = part.get_payload(decode=True)
    elif filename is not None:
        content_type = part.get_content_type()
        attachments.append(ContentFile(part.get_payload(decode=True), filename))
if body is None:
    body = ''

3 Answers

User

#1 · Edited 2024-09-30 22:15:13

You might want to try the following approach:

from email.iterators import typed_subpart_iterator


def get_charset(message, default="ascii"):
    """Get the message charset"""

    if message.get_content_charset():
        return message.get_content_charset()

    if message.get_charset():
        # get_charset() returns a Charset object, so take its name
        return message.get_charset().input_charset

    return default

def get_body(message):
    """Get the body of the email message"""

    if message.is_multipart():
        #get the plain text version only
        text_parts = [part
                      for part in typed_subpart_iterator(message,
                                                         'text',
                                                         'plain')]
        body = []
        for part in text_parts:
            charset = get_charset(part, get_charset(message))
            # bytes.decode works on both Python 2 and Python 3
            body.append(part.get_payload(decode=True).decode(charset,
                                                             "replace"))

        return u"\n".join(body).strip()

    else: # if it is not multipart, the payload will be a string
          # representing the message body
        body = message.get_payload(decode=True).decode(get_charset(message),
                                                       "replace")
        return body.strip()
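
On Python 3 the same idea can be sketched with the newer standard-library `email` API. This is only a minimal sketch, not the code from the answer above: the helper name `extract_plain_body` and the sample message are made up for illustration, and it assumes Python 3.6 or later, where parsing with `policy.default` yields `EmailMessage` objects.

```python
from email import message_from_bytes, policy
from email.message import EmailMessage


def extract_plain_body(raw_bytes):
    """Return the first text/plain body as str, or '' if there is none.

    Hypothetical helper, not part of the original answer.
    """
    msg = message_from_bytes(raw_bytes, policy=policy.default)
    part = msg.get_body(preferencelist=("plain",))
    if part is None:
        return ""
    # get_content() undoes the transfer encoding and decodes the bytes
    # with the declared charset, replacing undecodable bytes.
    return part.get_content().strip()


# Build a small sample message (text body plus one attachment).
sample = EmailMessage()
sample["Subject"] = "Test"
sample.set_content("gr\u00fc\u00dfe aus M\u00fcnchen")
sample.add_attachment(b"\x00\x01", maintype="application",
                      subtype="octet-stream", filename="data.bin")

body = extract_plain_body(sample.as_bytes())
```

Here the library picks the charset from the part's `Content-Type` header, so no manual `get_charset` helper is needed; whether that fits depends on how well-formed the incoming mail is.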

User

#2 · Edited 2024-09-30 22:15:13

Well, I found a solution myself. I'll run some tests now and let you know if anything fails.

I needed to decode the body once more:

body = part.get_payload(decode=True).decode(part.get_content_charset())
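
One caveat with that line: `get_content_charset()` returns `None` when the part declares no charset, and a declared charset may be unknown to Python, so the bare `.decode(...)` call can raise. Below is a minimal sketch of a more defensive variant for Python 3. The helper name `decode_part` and the UTF-8 fallback are assumptions for illustration, not something the poster described.

```python
from email import message_from_bytes


def decode_part(part, fallback="utf-8"):
    """Decode a non-multipart part's payload to text.

    Hypothetical helper; the fallback charset is an assumption.
    """
    payload = part.get_payload(decode=True)  # bytes, or None for multipart
    if payload is None:
        return ""
    charset = part.get_content_charset() or fallback
    try:
        return payload.decode(charset, "replace")
    except LookupError:
        # The declared charset is not one Python knows about.
        return payload.decode(fallback, "replace")


# Sample part with base64-encoded UTF-8 text and no declared charset.
raw = (b"Content-Type: text/plain\n"
       b"Content-Transfer-Encoding: base64\n"
       b"\n"
       b"Z3LDvMOfZQ==\n")
msg = message_from_bytes(raw)
body = decode_part(msg)
```

The `"replace"` error handler trades silent substitution of bad bytes for not crashing; if lossless decoding matters, a stricter handler may be the better choice.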

User

#3 · Edited 2024-09-30 22:15:13

You might want to take a look at ^{} (though I'm not sure it will solve your encoding problem).
