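I am trying to scrape data from a website, but the table keeps its data in two places: the first 2-3 rows are in thead and the rest are in tbody. I can easily extract the data from either one on its own, but when I try to get both I run into errors such as TypeError and AttributeError. By the way, I am using Python. Here is the code: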
import requests
from bs4 import BeautifulSoup
import pandas as pd
url="https://www.worldometers.info/world-population/"
r=requests.get(url)
print(r)
html=r.text
soup=BeautifulSoup(html,'html.parser')
print(soup.title.text)
print()
print()
live_data=soup.find_all('div',id='maincounter-wrap')
print(live_data)
for i in live_data:
    print(i.text)
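# The first few data rows sit inside <thead>
table_body=soup.find('thead')
table_rows=table_body.find_all('tr')
# The remaining rows sit inside <tbody>; search within it, not the whole page
table_body_2=soup.find('tbody')
table_rows_2=table_body_2.find_all('tr')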
year_july1=[]
population=[]
yearly_change_in_perchantage=[]
yearly_change=[]
median_age=[]
fertillity_rate=[]
density=[]  # density (P/km²)
urban_population_in_perchantage=[]
urban_population=[]
# Both sections share the same column layout, so one loop handles all rows
for tr in table_rows+table_rows_2:
    td=tr.find_all('td')
    # Skip rows without the nine expected <td> cells (e.g. header rows made of <th>)
    if len(td)<9:
        continue
    year_july1.append(td[0].text)
    population.append(td[1].text)
    yearly_change_in_perchantage.append(td[2].text)
    yearly_change.append(td[3].text)
    median_age.append(td[4].text)
    fertillity_rate.append(td[5].text)
    density.append(td[6].text)
    urban_population_in_perchantage.append(td[7].text)
    urban_population.append(td[8].text)
headers=['year_july1','population','yearly_change_in_perchantage','yearly_change','median_age','fertillity_rate','density','urban_population_in_perchantage','urban_population']
data_2= pd.DataFrame(list(zip(year_july1,population,yearly_change_in_perchantage,yearly_change,median_age,fertillity_rate,density,urban_population_in_perchantage,urban_population)),columns=headers)
print(data_2)
data_2.to_csv("C:\\Users\\data_2.csv")
You can try the code below; it produces the required data. Let me know if you need any clarification:
It gives me the following output:
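One way to collect rows from both thead and tbody in a single pass is to search for tr under the table itself, since find_all descends into both sections. The following is only a minimal sketch under stated assumptions: SAMPLE_HTML and its placeholder cell values are hypothetical stand-ins for the real page, extract_rows is an illustrative helper name, and it assumes that data rows use td cells while header rows use th.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the real page: a <th> header row and
# one data row inside <thead>, the remaining data rows inside <tbody>.
# The cell values are placeholders, not real figures.
SAMPLE_HTML = """
<table>
  <thead>
    <tr><th>Year</th><th>Population</th></tr>
    <tr><td>2020</td><td>p2020</td></tr>
  </thead>
  <tbody>
    <tr><td>2019</td><td>p2019</td></tr>
    <tr><td>2018</td><td>p2018</td></tr>
  </tbody>
</table>
"""

def extract_rows(html, n_columns):
    """Return the text of every data row, whichever table section holds it."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table")
    rows = []
    if table is None:
        # soup.find returns None when nothing matches; calling .find_all on
        # None is a common source of AttributeError in scraping code.
        return rows
    # find_all searches recursively, so rows in <thead> and <tbody> both match.
    for tr in table.find_all("tr"):
        cells = tr.find_all("td")
        # Rows built from <th>, or with an unexpected cell count, are skipped.
        if len(cells) != n_columns:
            continue
        rows.append([cell.get_text(strip=True) for cell in cells])
    return rows

rows = extract_rows(SAMPLE_HTML, n_columns=2)
for row in rows:
    print(row)
```

For the table in the question, n_columns would be 9, and the resulting list of rows could be passed to pd.DataFrame together with the headers list.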