How do I get plain text out of a BlackBerry 10 email?

Posted 2024-09-21 01:16:07


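BlackBerry 10 devices send HTML-only email.

While that is "nice" for the movement to ditch legacy content (a separate argument), it is annoying when you need the plain text. It simply isn't there.

How can I get plain text out of an email sent from a BB10 device?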


Tags: text, content, parameters, email, html, blackberry, bb10
1 answer
User
#1 · Posted 2024-09-21 01:16:07

Use Python and XPath to extract the text from the HTML:

#!/usr/bin/python3
import urllib.request
import quopri
import lxml.html

# actual test fragments are here
raw_url = 'https://gist.github.com/Supermathie/7866658/raw/80e4abd4226b916a54b224677af7fda881d0937f/sample+1'
raw_url_no_sig = 'https://gist.github.com/Supermathie/7866658/raw/df354d6b8f3176c3d8bdb89b2961bb0ccc78520c/sample+2'

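def get_divs(url):
    # Download the raw body, which is still quoted-printable encoded
    email_body_raw = urllib.request.urlopen(url).read()
    # Undo the quoted-printable transfer encoding to get the HTML bytes
    email_body = quopri.decodestring(email_body_raw)
    email_xml = lxml.html.document_fromstring(email_body)
    # Keep only the <div> siblings that precede the signature placeholder
    email_divs = email_xml.xpath('//div[@id="_signaturePlaceholder"]/preceding-sibling::div')
    return email_divs

# One output line per div; text_content() flattens any nested markup
print('\n'.join([str(node.text_content() or "") for node in get_divs(raw_url)]))
print('\n'.join([str(node.text_content() or "") for node in get_divs(raw_url_no_sig)]))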

For the two test cases, this prints:

Let's remember that the information in the article was filtered through no less than two people who don't fully speak tech. I think I can translate it back:

«The FBI crafted a custom piece of malware targeting Mo, designed to snoop his activities . A link was emailed to Mo in a spear phishing attack in an attempt to get hin to download and install the malware from the FBI's monitored servers. 

The attempt failed; the software was downloaded but never executed in a manner enabling the software to send back information to the FBI.»

Nothing too special. I wonder if Mo had the balls to submit the software to Sophos etc. for malware analysis. :)

M.

and

Test email

No signature
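
If lxml is not available, or you have a complete MIME message rather than a bare body fragment, roughly the same idea can be sketched with the standard library alone. This is a variation, not the approach above: it swaps XPath for `html.parser` and lets the `email` package undo the quoted-printable encoding. The names `BodyTextExtractor`, `html_email_to_text` and `SAMPLE` are made up for illustration, the sample message is hand-written, and the sketch assumes the layout implied by the XPath above (body divs followed by a div with id `_signaturePlaceholder`). Unlike the XPath version, it simply stops at the placeholder and drops blank lines.

```python
from email import message_from_bytes, policy
from html.parser import HTMLParser


class BodyTextExtractor(HTMLParser):
    """Collect visible text, stopping at the signature placeholder element."""

    BLOCK_TAGS = ("div", "p", "br")             # tags that start a new line
    SKIPPED_TAGS = ("style", "script", "title")  # content that is not body text

    def __init__(self, stop_id="_signaturePlaceholder"):
        super().__init__(convert_charrefs=True)
        self.stop_id = stop_id   # assumed id, taken from the XPath in the answer
        self.stopped = False
        self.skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self.skip_depth += 1
        if dict(attrs).get("id") == self.stop_id:
            self.stopped = True              # ignore everything from here on
        elif tag in self.BLOCK_TAGS and not self.stopped:
            self.chunks.append("\n")

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self.skip_depth:
            self.skip_depth -= 1

    def handle_data(self, data):
        if not self.stopped and self.skip_depth == 0:
            self.chunks.append(data)

    def text(self):
        # Trim each line and drop the empty ones
        lines = [line.strip() for line in "".join(self.chunks).splitlines()]
        return "\n".join(line for line in lines if line)


def html_email_to_text(raw_message: bytes) -> str:
    """Return the text of the HTML part, or '' if the message has none."""
    msg = message_from_bytes(raw_message, policy=policy.default)
    html_part = msg.get_body(preferencelist=("html",))
    if html_part is None:
        return ""
    parser = BodyTextExtractor()
    # get_content() decodes the transfer encoding and the charset
    parser.feed(html_part.get_content())
    parser.close()
    return parser.text()


# Hand-written sample message, not real BB10 output
SAMPLE = (
    b"MIME-Version: 1.0\r\n"
    b"Content-Type: text/html; charset=\"utf-8\"\r\n"
    b"Content-Transfer-Encoding: quoted-printable\r\n"
    b"\r\n"
    b"<html><body><div>Test email</div><div><br></div>"
    b"<div>No signature</div>=\r\n"
    b"<div id=3D\"_signaturePlaceholder\"></div>"
    b"<div>Sent from my phone</div></body></html>\r\n"
)

if __name__ == "__main__":
    print(html_email_to_text(SAMPLE))
```

Stopping at the placeholder keeps the parser simple, but it would also drop any body text that a client happened to place after the signature, so check it against real messages before relying on it.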
