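<p>The emails on this page are protected, so I have only added the email-scraping part. I have not added the Excel part, since you are not having problems with that. Credit for converting the protected emails to text goes to <a href="https://stackoverflow.com/a/36913154/7518304">https://stackoverflow.com/a/36913154/7518304</a></p>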
<pre><code>import requests
from bs4 import BeautifulSoup

emailList = []
# add the URL of the page you want to scrape to urlString
urlString = 'http://www.uschess.org/assets/msa_joomla/AffiliateSearch/clubresultsnew.php?st=AL'

def decodeEmail(e):  # https://stackoverflow.com/a/36913154/7518304
    # the first two hex digits are the XOR key; each following pair is one encoded character
    de = ""
    k = int(e[:2], 16)
    for i in range(2, len(e) - 1, 2):
        de += chr(int(e[i:i+2], 16) ^ k)
    return de

# extracts all protected emails from the given page and stores them in emailList
def emailExtractor(urlString):
    getH = requests.get(urlString)
    h = getH.content
    soup = BeautifulSoup(h, 'html.parser')
    mailtos = soup.select('a[href]')
    for i in mailtos:
        href = i['href']
        # protected links carry the encoded address after the "#"
        if "email-protect" in href:
            emailList.append(decodeEmail(href.split("#")[1]))

emailExtractor(urlString)
print(emailList)
</code></pre>
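<p>To check the decoding step without hitting the network, here is a minimal round-trip sketch. <code>encode_email</code> is a hypothetical helper written only for this illustration (it is not part of any library), and the sample address and the <code>/cdn-cgi/l/email-protection#</code> href are made-up test values that merely mimic the shape of the protected links on the page.</p>

```python
def encode_email(address, key):
    """Hypothetical helper: XOR each character with key and hex-encode it,
    prefixing the key itself as the first two hex digits."""
    return f"{key:02x}" + "".join(f"{ord(ch) ^ key:02x}" for ch in address)


def decode_email(encoded):
    """Reverse of encode_email: same logic as decodeEmail in the answer."""
    key = int(encoded[:2], 16)
    return "".join(
        chr(int(encoded[i:i + 2], 16) ^ key)
        for i in range(2, len(encoded), 2)
    )


# Made-up sample data, assumed to resemble a protected link's href
sample = encode_email("user@example.com", 0x42)
href = "/cdn-cgi/l/email-protection#" + sample

# Same extraction step as in emailExtractor: take the part after "#"
decoded = decode_email(href.split("#")[1])
print(decoded)
```

<p>If the printed address matches the one you encoded, the XOR decoding is behaving as expected, and any remaining problem is more likely in the page fetching or the link selection.</p>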