<p>You can do this with <code>pandas</code>. Here is the complete code:</p>
<pre><code>from bs4 import BeautifulSoup
import requests
import pandas as pd

urlString = 'http://www.uschess.org/assets/msa_joomla/AffiliateSearch/clubresultsnew.php?st=AL'


# function that extracts the links (including any mailto: addresses) from the
# page you provide, stores them in a list and writes that list to an Excel sheet
def emailExtractor(urlString):
    emailList = []
    getH = requests.get(urlString)
    h = getH.content
    soup = BeautifulSoup(h, 'html.parser')
    # href=True skips anchors without an href attribute (i['href'] would raise KeyError)
    mailtos = soup.find_all('a', href=True)
    href_lst = []
    for i in mailtos:
        href_lst.append(i['href'])
    for href in href_lst:
        # keeps every href that contains a colon, e.g. 'http:', 'https:' or 'mailto:'
        if ':' in href:
            emailList.append(href)
    print(emailList)
    s = pd.Series(emailList)
    s = s.rename('Emails')
    # note: pandas 2.0 and later can no longer write .xls files; use an .xlsx path there
    s.to_excel('D:\\Emails.xls', index=False)


emailExtractor(urlString)
</code></pre>
<p>Output:</p>
<pre><code>['http://msa.uschess.org/AffDtlMain.php?T6006791', 'https://alabamachess.org', 'http://msa.uschess.org/AffDtlMain.php?A6029262', 'http://www.caesarchess.com/', 'http://msa.uschess.org/AffDtlMain.php?A6045660', 'http://msa.uschess.org/AffDtlMain.php?H6046485', 'http://msa.uschess.org/AffDtlMain.php?A6040580']
</code></pre>
<p>Screenshot of the Excel sheet:</p>
<p><a href="https://i.stack.imgur.com/zxPC5.png" rel="nofollow noreferrer"><img src="https://i.stack.imgur.com/zxPC5.png" alt="enter image description here"/></a></p>
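<p>Note that the <code>':' in href</code> check keeps every link that has a scheme, which is why the output above contains <code>http</code> URLs. If you only want e-mail addresses, one possible approach is to keep just the <code>mailto:</code> links. This is a minimal sketch: the function name <code>extract_emails</code> and the sample hrefs are made up for illustration, and it assumes the page exposes addresses as <code>mailto:</code> links at all.</p>
<pre><code>def extract_emails(hrefs):
    """Return the addresses from the mailto: entries of a list of href strings."""
    emails = []
    for href in hrefs:
        # Compare case-insensitively, since the scheme may be written 'MAILTO:'.
        if href.lower().startswith('mailto:'):
            # Drop the scheme and any '?subject=...' query part.
            address = href[len('mailto:'):].split('?', 1)[0]
            if address:
                emails.append(address)
    return emails


# Hypothetical hrefs, shaped like the href_lst built in the code above.
sample_hrefs = [
    'mailto:club@example.com',
    'https://alabamachess.org',
    'MAILTO:info@example.org?subject=Hello',
    'http://www.caesarchess.com/',
]
print(extract_emails(sample_hrefs))
</code></pre>
<p>In the original function you would call this on <code>href_lst</code> in place of the <code>':' in href</code> loop.</p>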
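<p>If you want the links written to the Excel sheet as <code>hyperlinks</code> (clicking a link opens the website), change <code>emailList.append(href)</code> to <code>emailList.append('=HYPERLINK("'+href+'")')</code>.
You should also change the file extension to <code>.xlsx</code>; only then will the links be clickable hyperlinks.</p>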
<p>Output:</p>
<p><a href="https://i.stack.imgur.com/q05IK.png" rel="nofollow noreferrer"><img src="https://i.stack.imgur.com/q05IK.png" alt="enter image description here"/></a></p>
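<p>The hyperlink variant described above could be sketched on its own like this. The helper names <code>to_hyperlink_formula</code> and <code>links_to_series</code> are hypothetical, the output path is a placeholder, and doubling any double quote in the URL is an extra precaution that is not part of the original code.</p>
<pre><code>import pandas as pd


def to_hyperlink_formula(url):
    """Wrap a URL in an Excel HYPERLINK formula."""
    # Excel escapes a double quote inside a formula string by doubling it.
    safe_url = url.replace('"', '""')
    return '=HYPERLINK("' + safe_url + '")'


def links_to_series(urls):
    """Build a one-column Series of formulas, named 'Emails' as in the code above."""
    return pd.Series([to_hyperlink_formula(u) for u in urls], name='Emails')


# Two of the links from the output shown earlier.
links = ['https://alabamachess.org', 'http://www.caesarchess.com/']
s = links_to_series(links)
print(s.tolist())

# Writing needs an .xlsx engine such as openpyxl; the path is a placeholder.
# s.to_excel('Emails.xlsx', index=False)
</code></pre>
<p>Keeping the formula construction in its own function makes it easy to check the generated strings before anything is written to disk.</p>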
<p>Hope this helps.</p>