Is there a quick way to scrape player data from Futbin with BS4?

Posted 2024-09-27 21:34:01


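I'm working on a personal web scraping project that collects player data from futbin.com. I noticed that the stats on the site are not in table tags but in div tags, and I'm wondering whether there is a quick way to get all of these stats at once rather than line by line.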


I have already scraped the information on the left side of the screen with this function:

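from io import StringIO
import pandas as pd
import requests
from bs4 import BeautifulSoup

def myhtml(url):
    # use BS4 to get the table that has the required data
    html = str(BeautifulSoup(requests.get(url).content, 'html.parser').find('div', id='info_content').find("table"))
    # read_html() returns a list, take the first one; the first column holds the attribute names, transpose to build the DF
    # (StringIO wrapper: recent pandas versions deprecate passing a literal HTML string)
    return pd.read_html(StringIO(html))[0].set_index(0).T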

Now I'm after all of the individual stats. In my code below I only target the "key" stats shown on the card. This is the function I have for the stats so far:

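import numpy as np  # np.nan is used below; requests, pandas (pd) and BeautifulSoup are needed as well

def stats_scraper(url):
    # url is an iterable of player page links
    # Empty lists to save the stats in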
    pace_list = []
    shooting_list = []
    passing_list = []
    dribbling_list = []
    defending_list = []
    physical_list = []
    # Looping through all the links
    for link in url:
        page = requests.get(link)
        soup = BeautifulSoup(page.content, 'html.parser')
        # Find the player stats
        pace = soup.find_all('div', id='main-pace-val-0')
        shooting = soup.find_all('div', id='main-shooting-val-0')
        passing = soup.find_all('div', id='main-passing-val-0') 
        dribbling = soup.find_all('div', id='main-dribblingp-val-0') 
        defending = soup.find_all('div', id='main-defending-val-0') 
        physical = soup.find_all('div', id='main-heading-val-0')
        # Looping through every stat
        for stat in pace:
            try:
              pace_list.append(stat.text.strip())
            except AttributeError:
              pace_list.append(np.nan)

        for stat in shooting:
            try:
              shooting_list.append(stat.text.strip())
            except AttributeError:
              shooting_list.append(np.nan)

        for stat in passing:
            try:
              passing_list.append(stat.text.strip())
            except AttributeError:
              passing_list.append(np.nan)

        for stat in dribbling:
            try:
              dribbling_list.append(stat.text.strip())
            except AttributeError:
              dribbling_list.append(np.nan)
        
        for stat in defending:
            try:
              defending_list.append(stat.text.strip())
            except AttributeError:
              defending_list.append(np.nan)

        for stat in physical:
            try:
              physical_list.append(stat.text.strip())
            except AttributeError:
              physical_list.append(np.nan)
        
    stats_frame = pd.DataFrame({
      'pace':pace_list,
      'shooting':shooting_list,
      'passing':passing_list,
      'dribbling':dribbling_list,
      'defending':defending_list,
      'physical':physical_list
    })
    return stats_frame

I'd like to know whether there is a quick way to get all of the stats with less code. Thanks in advance.


1 answer
Answer 1 · posted 2024-09-27 21:34:01

You can use .find_next() to get the value that follows each stat name:

import requests
import pandas as pd
from bs4 import BeautifulSoup


url = "https://www.futbin.com/21/player/541/lionel-messi"
soup = BeautifulSoup(requests.get(url).content, "html.parser")

data = {}
for s in soup.select(".left_stat_name"):
    data[s.get_text(strip=True)] = s.find_next(class_="stat_val").get_text(
        strip=True
    )

print(pd.DataFrame([data]).T)

Prints:

                   0
Pace              85
Acceleration      91
Sprint Speed      80
Shooting          92
Positioning       93
Finishing         95
Shot Power        86
Long Shots        94
Volleys           88
Penalties         75
Passing           91
Vision            95
Crossing          85
FK. Accuracy      94
Short Passing     91
Long Passing      91
Curve             93
Dribbling         96
Agility           91
Balance           95
Reactions         94
Ball Control      96
Composure         96
Defending         38
Interceptions     40
Heading Accuracy  70
Def. Awareness    32
Standing Tackle   35
Sliding Tackle    24
Physicality       65
Jumping           68
Stamina           72
Strength          69
Aggression        44
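
To cover several players with the same approach, the loop can be wrapped in a function and the per-player dictionaries collected into one DataFrame. The following is a minimal sketch under stated assumptions, not something tested against the live site: parse_stats, stats_frame, KEY_STATS and SAMPLE_HTML are hypothetical names, and the sample markup only imitates the left_stat_name / stat_val classes used above. For real pages you would pass requests.get(link).text for each link instead of the sample string.

```python
import pandas as pd
from bs4 import BeautifulSoup

# Hypothetical selection of the "key" card stats, named as in the output above.
KEY_STATS = ["Pace", "Shooting", "Passing", "Dribbling", "Defending", "Physicality"]


def parse_stats(html):
    """Return {stat name: value} for one player page given as an HTML string."""
    soup = BeautifulSoup(html, "html.parser")
    stats = {}
    for name in soup.select(".left_stat_name"):
        # The value is assumed to be the next element carrying class "stat_val".
        value = name.find_next(class_="stat_val")
        stats[name.get_text(strip=True)] = (
            value.get_text(strip=True) if value is not None else None
        )
    return stats


def stats_frame(pages, columns=None):
    """Build one row per page; optionally keep only the given columns."""
    frame = pd.DataFrame([parse_stats(html) for html in pages])
    if columns is not None:
        # reindex keeps the requested order and fills missing stats with NaN
        frame = frame.reindex(columns=columns)
    return frame


# Made-up markup that only mimics the class names, so the sketch runs offline.
SAMPLE_HTML = """
<div><span class="left_stat_name">Pace</span><div class="stat_val">85</div></div>
<div><span class="left_stat_name">Acceleration</span><div class="stat_val">91</div></div>
<div><span class="left_stat_name">Shooting</span><div class="stat_val">92</div></div>
"""

frame = stats_frame([SAMPLE_HTML], KEY_STATS)
print(frame)
```

The values come back as strings, as in the answer above; a stat that is absent from a page ends up as NaN in its column rather than raising an error, which replaces the per-stat try/except blocks in the question.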
