<p>You will want to get familiar with the <a href="https://www.crummy.com/software/BeautifulSoup/bs4/doc/" rel="nofollow noreferrer">BeautifulSoup</a> package, which lets you navigate the contents of a web page from Python.</p>
<p><strong>Using BeautifulSoup:</strong><br/></p>
<pre><code>import requests
from bs4 import BeautifulSoup
url = 'https://hypeauditor.com/top-instagram/'
r = requests.get(url)
html = r.text
soup = BeautifulSoup(html, 'html.parser')
top_bloggers = soup.find('table', id="bloggers-top-table")
table_body = top_bloggers.find('tbody')
rows = table_body.find_all('tr')
# For all data:
# Will retrieve a list of lists, good for inputting to pandas
data=[]
for row in rows:
    cols = row.find_all('td')
    cols = [ele.text.strip() for ele in cols]
    data.append([ele for ele in cols if ele])  # Get rid of empty values
# For just handles:
# Will retrieve a list of handles, only
handles=[]
for row in rows:
    cols = row.find_all('td')
    # The handle is the last line of text in the fourth cell
    values = cols[3].text.strip().split('\n')
    handles.append(values[-1])
</code></pre>
<blockquote>
<p><em>The for loop I use for rows is sourced from this <a href="https://stackoverflow.com/a/23377804">answer</a></em></p>
</blockquote>
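<p>As a minimal sketch of the same row-walking approach, the snippet below parses a small inline table instead of the live page, so it runs without network access. <code>SAMPLE_HTML</code>, <code>extract_rows</code> and <code>extract_handles</code> are hypothetical names made up for illustration, and the sample markup is only an assumption about the table's shape. Since <code>soup.find</code> returns <code>None</code> when nothing matches, the sketch checks for that before descending into the table.</p>

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the downloaded page; the real markup may differ.
SAMPLE_HTML = """
<table id="bloggers-top-table">
  <tbody>
    <tr><td>1</td><td></td><td>Music</td><td>First Name
@first_handle</td></tr>
    <tr><td>2</td><td></td><td>Sports</td><td>Second Name
@second_handle</td></tr>
  </tbody>
</table>
"""


def extract_rows(html, table_id="bloggers-top-table"):
    """Return one list of stripped cell texts per table row."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table", id=table_id)
    if table is None:
        # find() returns None when no matching table exists
        return []
    body = table.find("tbody") or table
    rows = []
    for tr in body.find_all("tr"):
        cells = [td.get_text().strip() for td in tr.find_all("td")]
        if cells:  # skip rows without <td> cells, such as header rows
            rows.append(cells)
    return rows


def extract_handles(rows, column=3):
    """Take the last line of text in the given column of each row."""
    return [row[column].split("\n")[-1].strip()
            for row in rows if len(row) > column]


rows = extract_rows(SAMPLE_HTML)
handles = extract_handles(rows)
print(handles)
```

<p>Empty cells are kept here so that the column index stays stable. If the table is missing from the HTML you fetch, <code>extract_rows</code> returns an empty list rather than raising an <code>AttributeError</code>.</p>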