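<p>First locate the table with the CSS selector below, then use pandas' <code>read_html()</code>
to load it into a DataFrame.
This collects the data from all of the pages into a single DataFrame.</p>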
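<pre><code>from io import StringIO

import pandas as pd
import requests
from bs4 import BeautifulSoup

listurl = ['https://ballotpedia.org/Governor_(state_executive_office)', 'https://ballotpedia.org/Lieutenant_Governor_(state_executive_office)', 'https://ballotpedia.org/Secretary_of_State_(state_executive_office)', 'https://ballotpedia.org/Attorney_General_(state_executive_office)']

frames = []
for url in listurl:
    res = requests.get(url)
    soup = BeautifulSoup(res.text, 'html.parser')
    # Take the last table that matches the selector
    table = soup.select("table#officeholder-table")[-1]
    # read_html returns a list of DataFrames; wrapping the markup in StringIO
    # avoids the literal-HTML deprecation warning in recent pandas versions
    df = pd.read_html(StringIO(str(table)))[0]
    frames.append(df)

# DataFrame.append was removed in pandas 2.0, so combine with pd.concat
df1 = pd.concat(frames, ignore_index=True)
print(df1)
</code></pre>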
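<p>Once the tables are combined, the rows no longer show which page they came from. The sketch below is one possible way to derive a label from each URL so it can be stored alongside the data. The helper name <code>office_from_url</code> and the idea of an extra label column are assumptions for illustration, not part of the original answer, and the parsing relies on the URL pattern seen in the list above.</p>

```python
from urllib.parse import unquote, urlparse


def office_from_url(url: str) -> str:
    """Hypothetical helper: turn a page URL into a readable office name."""
    # Last path segment, e.g. "Lieutenant_Governor_(state_executive_office)"
    slug = unquote(urlparse(url).path).rstrip("/").rsplit("/", 1)[-1]
    # Drop the trailing parenthesised qualifier, if there is one
    name = slug.split("_(", 1)[0]
    # Underscores in the slug stand for spaces
    return name.replace("_", " ")


listurl = [
    'https://ballotpedia.org/Governor_(state_executive_office)',
    'https://ballotpedia.org/Lieutenant_Governor_(state_executive_office)',
    'https://ballotpedia.org/Secretary_of_State_(state_executive_office)',
    'https://ballotpedia.org/Attorney_General_(state_executive_office)',
]

labels = [office_from_url(u) for u in listurl]
print(labels)
```

<p>Inside the scraping loop, the result of calling this helper on the current URL could be assigned to a new column of <code>df</code> (for example one named <code>office</code>, an assumed name) before the frames are combined.</p>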
<hr/>
<p>If you want a separate DataFrame for each page instead, try this:</p>
<pre><code>from io import StringIO

import pandas as pd
import requests
from bs4 import BeautifulSoup

listurl = ['https://ballotpedia.org/Governor_(state_executive_office)', 'https://ballotpedia.org/Lieutenant_Governor_(state_executive_office)', 'https://ballotpedia.org/Secretary_of_State_(state_executive_office)', 'https://ballotpedia.org/Attorney_General_(state_executive_office)']

for url in listurl:
    res = requests.get(url)
    soup = BeautifulSoup(res.text, 'html.parser')
    # Take the last table that matches the selector
    table = soup.select("table#officeholder-table")[-1]
    # read_html returns a list of DataFrames; keep the first one
    df = pd.read_html(StringIO(str(table)))[0]
    # Print each page's DataFrame on its own
    print(df)
</code></pre>