How do I retrieve a hyperlink from an email and visit it?

    import imaplib, rfc822, sys
    from bs4 import BeautifulSoup

    server ='imap.laposte.net'
    username='username'
    password='VeryStrong'

    M = imaplib.IMAP4(server)
    M.login(username, password)
    M.select()
    typ, data = M.search(None, 'ALL')
    for num in data[0].split():
        typ, data = M.fetch(num, '(RFC822)')
        pos1=data[0][1][0:1000].find('entre-infideles')
        if pos1 != -1:
            print '06ReadImap: Message %s' % (num)
            pos2=data[0][1][pos1:].find('Subject')
            pos3=data[0][1][pos1+pos2:].find('Subject: <PUB>')
            pos4=data[0][1][pos1+pos2+pos3:].find('votre profil')
            if pos4 != -1:
                print '06ReadImap: Pos4(votre profil)=%i' % (pos2+pos3+pos4)
                print data[0][1][pos1+pos2+pos3:pos1+pos2+pos3+pos4+12]
                soup=BeautifulSoup(data[0][1])
                for link in soup.find_all('a'):
                    print(link.get('href'))
                sys.exit(0)

2 Answers

User

#1 · edited 2024-10-04 09:30:07

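You need to undo the message's Content-Transfer-Encoding first. This one looks like quoted-printable, which is what confuses your HTML parser: in the raw body every "=" is written as "=3D" and long lines are split with a trailing "=", so the href attributes are not valid HTML until the body is decoded.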
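To illustrate the point, here is a minimal Python 3 sketch using only the standard library. The sample body and the example.com URL are made up for illustration, and `LinkCollector` / `extract_links` are hypothetical helpers standing in for BeautifulSoup so the snippet has no third-party dependency.

```python
import quopri
from html.parser import HTMLParser

# Made-up quoted-printable body: "=3D" encodes "=", and a trailing "="
# at the end of a line is a soft line break that splits the URL in two.
raw = (b'<html><body><a href=3D"http://example.com/pro=\n'
       b'file?id=3D42">votre profil</a></body></html>')


class LinkCollector(HTMLParser):
    """Hypothetical helper: collects the href of every <a> tag."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href is not None:
                self.links.append(href)


def extract_links(html_text):
    """Return the href values of all <a> tags in html_text."""
    parser = LinkCollector()
    parser.feed(html_text)
    parser.close()
    return parser.links


# Undo the transfer encoding first, then parse the HTML.
decoded = quopri.decodestring(raw).decode("utf-8")
links = extract_links(decoded)
print(links)
```

After decoding, the soft line break is gone and the href is a single intact URL, which is what the parser needs.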

User

#2 · edited 2024-10-04 09:30:07

        # Python 2 fragment; requires "import quopri" at the top of the script.
        # Undo the quoted-printable transfer encoding before parsing.
        result = quopri.decodestring(data[0][1])
        soup = BeautifulSoup(result)
        print "\n         Extracting all the URLs found within page 1's <a> tags :"
        for link in soup.find_all('a'):
            print(link.get('href'))

Here you go :D
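As an alternative to decoding the whole raw message with `quopri`, the standard `email` package can parse the message and undo the transfer encoding per MIME part. The following is a Python 3 sketch under stated assumptions: the `SAMPLE` message, the example.com address and URL, and the helper names (`HrefParser`, `links_in_message`, `visit`) are all invented for illustration. In the question's script, the bytes returned by `M.fetch(num, '(RFC822)')` would take the place of `SAMPLE`.

```python
import email
import urllib.request
from email import policy
from html.parser import HTMLParser

# Made-up message standing in for data[0][1] from M.fetch(num, '(RFC822)').
SAMPLE = (
    b"From: pub@example.com\r\n"
    b"Subject: <PUB> votre profil\r\n"
    b"MIME-Version: 1.0\r\n"
    b"Content-Type: text/html; charset=utf-8\r\n"
    b"Content-Transfer-Encoding: quoted-printable\r\n"
    b"\r\n"
    b'<a href=3D"http://example.com/pro=\r\n'
    b'file?id=3D42">votre profil</a>\r\n'
)


class HrefParser(HTMLParser):
    """Hypothetical helper: collects the href of every <a> tag."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href is not None:
                self.hrefs.append(href)


def links_in_message(raw_message):
    """Return the hrefs found in every text/html part of a raw message."""
    msg = email.message_from_bytes(raw_message, policy=policy.default)
    found = []
    for part in msg.walk():
        if part.get_content_type() == "text/html":
            parser = HrefParser()
            # get_content() undoes the transfer encoding and the charset.
            parser.feed(part.get_content())
            parser.close()
            found.extend(parser.hrefs)
    return found


def visit(url, timeout=10):
    """Fetch url and return the HTTP status code (needs network access)."""
    if not url.startswith(("http://", "https://")):
        raise ValueError("refusing to open non-HTTP link: %r" % url)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.status


links = links_in_message(SAMPLE)
print(links)
# visit(links[0]) would then request the first link; it is not called
# here because it requires network access.
```

Because `get_content()` looks at each part's own headers, the same code should also cope with base64-encoded HTML parts. The scheme check in `visit` is a deliberate precaution, since links taken from unsolicited mail are not necessarily safe to open automatically.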
