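I'm working on a personal web-scraping project that collects player data from futbin.com. I noticed that the stats on the site are not in table tags but in div tags, and I'd like to know whether there is a quick way to get all of these stats at once rather than line by line.
I have already scraped the information on the left side of the screen with this function: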
import pandas as pd
import requests
from bs4 import BeautifulSoup

def myhtml(url):
    # use BS4 to get table that has required data
    html = str(BeautifulSoup(requests.get(url).content, 'html.parser').find('div', id='info_content').find("table"))
    # read_html() returns a list, take first one, first column are attribute name, transpose to build DF
    return pd.read_html(html)[0].set_index(0).T
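Now I'm looking to get all the individual stats. In my code below I'm only interested in the "key" stats available on the card. This is the function I have for the stats so far: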
import numpy as np

def stats_scraper(url):
    # Empty lists to save stats in
    # Pace
    pace_list = []
    shooting_list = []
    passing_list = []
    dribbling_list = []
    defending_list = []
    physical_list = []
    # Looping through all the links
    for link in url:
        page = requests.get(link)
        soup = BeautifulSoup(page.content, 'html.parser')
        # Find the player stats
        pace = soup.find_all('div', id='main-pace-val-0')
        shooting = soup.find_all('div', id='main-shooting-val-0')
        passing = soup.find_all('div', id='main-passing-val-0')
        dribbling = soup.find_all('div', id='main-dribblingp-val-0')
        defending = soup.find_all('div', id='main-defending-val-0')
        physical = soup.find_all('div', id='main-heading-val-0')
        # Looping through every stat
        for stat in pace:
            try:
                pace_list.append(stat.text.strip())
            except AttributeError:
                pace_list.append(np.nan)
        for stat in shooting:
            try:
                shooting_list.append(stat.text.strip())
            except AttributeError:
                shooting_list.append(np.nan)
        for stat in passing:
            try:
                passing_list.append(stat.text.strip())
            except AttributeError:
                passing_list.append(np.nan)
        for stat in dribbling:
            try:
                dribbling_list.append(stat.text.strip())
            except AttributeError:
                dribbling_list.append(np.nan)
        for stat in defending:
            try:
                defending_list.append(stat.text.strip())
            except AttributeError:
                defending_list.append(np.nan)
        for stat in physical:
            try:
                physical_list.append(stat.text.strip())
            except AttributeError:
                physical_list.append(np.nan)
    stats_frame = pd.DataFrame({
        'pace':pace_list,
        'shooting':shooting_list,
        'passing':passing_list,
        'dribbling':dribbling_list,
        'defending':defending_list,
        'physical':physical_list
    })
    return stats_frame
Is there a quick way to get all the stats with less code? Thanks in advance.
You can use .find_next() to get the values of the various player stats.
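As a rough illustration of the .find_next() idea, here is a minimal sketch that looks up each stat label and takes the first div that follows it. The markup in SAMPLE_HTML, the function name extract_stats, and the label names are all hypothetical, made up for this example; the real futbin.com page structure is not shown here and would need to be inspected before adapting the selectors.

```python
from bs4 import BeautifulSoup

# Hypothetical markup (NOT futbin's real HTML): each stat label is a <span>
# that is followed by a <div> holding the value.
SAMPLE_HTML = """
<div id="stats">
  <span class="label">Pace</span><div class="value"> 85 </div>
  <span class="label">Shooting</span><div class="value"> 92 </div>
  <span class="label">Passing</span><div class="value"> 91 </div>
</div>
"""

def extract_stats(html, labels):
    """Return {label: value} using find_next() to hop from label to value."""
    soup = BeautifulSoup(html, "html.parser")
    stats = {}
    for label in labels:
        # Locate the <span> whose text is exactly the label
        tag = soup.find("span", string=label)
        # find_next() returns the first matching tag that appears later in the document
        value = tag.find_next("div") if tag is not None else None
        stats[label.lower()] = value.get_text(strip=True) if value is not None else None
    return stats

print(extract_stats(SAMPLE_HTML, ["Pace", "Shooting", "Passing", "Physical"]))
# {'pace': '85', 'shooting': '92', 'passing': '91', 'physical': None}
```

Because every stat goes through the same loop, the per-stat lists and try/except blocks are no longer needed. Collecting one such dict per URL in a list and passing that list to pd.DataFrame would give one row per player.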