BeautifulSoup returns no output

import requests
from bs4 import BeautifulSoup

url = 'https://www.eredmenyek.com/foci/nemetorszag/bundesliga/'
oldal = requests.get(url)
soup = BeautifulSoup(oldal.text, "lxml")
review_table_elem = soup.find_all('div', {'class': 'stats-table-container'})
print(review_table_elem)

2 Answers

User

Answer 1 · edited 2024-09-29 19:12:27

An alternative to Selenium is requests-html. Since you are already familiar with requests, you should be able to pick it up easily.

from bs4 import BeautifulSoup
from requests_html import HTMLSession

session = HTMLSession()
r = session.get('https://www.eredmenyek.com/foci/nemetorszag/bundesliga/')
# Run the page's JavaScript in headless Chromium and sleep 5 seconds after the initial render.
r.html.render(sleep=5)
# r.html.html now holds the rendered markup rather than the raw response.
soup = BeautifulSoup(r.html.html, "html.parser")
review_table_elem = soup.find_all('div', {'class': 'stats-table-container'})
print(review_table_elem)

User

Answer 2 · edited 2024-09-29 19:12:27

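The page you are working with relies heavily on JavaScript to render its content. The data you are looking for does not appear in the response you get from requests, because requests does not execute JavaScript.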
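One way to confirm this diagnosis is to count how many elements carry the target class in the markup you actually received. The snippet below is a minimal sketch using only the standard library. The helper `count_class` and both HTML strings are hypothetical stand-ins, not the real page's markup.

```python
from html.parser import HTMLParser


class ClassCounter(HTMLParser):
    """Counts start tags whose class attribute contains a given class name."""

    def __init__(self, class_name):
        super().__init__()
        self.class_name = class_name
        self.count = 0

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value may be None.
        classes = (dict(attrs).get("class") or "").split()
        if self.class_name in classes:
            self.count += 1


def count_class(html, class_name):
    """Return how many elements in `html` carry `class_name`."""
    parser = ClassCounter(class_name)
    parser.feed(html)
    parser.close()
    return parser.count


# Hypothetical stand-in for a raw response: an empty shell filled in later by a script.
raw_html = '<html><body><div id="app"></div><script src="app.js"></script></body></html>'
# Hypothetical stand-in for the markup after JavaScript has run.
rendered_html = '<html><body><div class="stats-table-container"><table></table></div></body></html>'

print(count_class(raw_html, "stats-table-container"))
print(count_class(rendered_html, "stats-table-container"))
```

With the real page you would pass `requests.get(url).text` as the first argument. A count of zero there, while the browser's developer tools show the element, suggests the content is injected by JavaScript.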

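To get the rendered data, you will need something like Selenium WebDriver. Below is a solution that uses it with a headless Chrome instance. Besides installing the selenium module, you will also need to download ChromeDriver and change the code below to point to the location where you extracted it: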

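from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.chrome.service import Service

options = Options()
options.add_argument("--headless")  # run Chrome without a visible window

# Selenium 4 takes the driver path through a Service object;
# older releases accepted executable_path directly on webdriver.Chrome.
driver = webdriver.Chrome(
    options=options, service=Service(executable_path=r"C:\chromedriver\chromedriver.exe")
)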

try:
    driver.get("https://www.eredmenyek.com/foci/nemetorszag/bundesliga/")
    soup = BeautifulSoup(driver.page_source, "html.parser")

    for row in soup.select(".stats-table-container tr"):
        print("\t".join([e.text for e in row.select("td")]))

finally:
    driver.quit()

Result:

1.      Borussia Dortmund       20      15      4       1       51:20   49            
2.      Mönchengladbach 20      13      3       4       41:18   42            
3.      Bayern München  20      13      3       4       44:23   42            
4.      RB Leipzig      20      11      4       5       38:18   37            
5.      Frankfurt       20      9       5       6       40:27   32   
...
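If you want the standings as structured data rather than tab-separated text, each row's cell texts can be converted into a record. The following is a rough sketch, not part of the original answer. The `Standing` class and `parse_row` helper are hypothetical names, and the column meanings (rank, team, played, wins, draws, losses, goals, points) are an assumption inferred from the output above.

```python
from dataclasses import dataclass


@dataclass
class Standing:
    rank: int
    team: str
    played: int
    wins: int
    draws: int
    losses: int
    goals_for: int
    goals_against: int
    points: int


def parse_row(cells):
    """Convert one row of cell texts into a Standing, or None if it does not fit."""
    # Drop empty cells, such as trailing columns with no text.
    cells = [c.strip() for c in cells if c.strip()]
    if len(cells) != 8:
        return None  # header rows (no <td> cells) or an unexpected layout
    rank, team, played, wins, draws, losses, goals, points = cells
    goals_for, goals_against = goals.split(":")
    return Standing(
        rank=int(rank.rstrip(".")),
        team=team,
        played=int(played),
        wins=int(wins),
        draws=int(draws),
        losses=int(losses),
        goals_for=int(goals_for),
        goals_against=int(goals_against),
        points=int(points),
    )


# Cell texts copied from the first row of the output above.
example = ["1.", "Borussia Dortmund", "20", "15", "4", "1", "51:20", "49", "", ""]
print(parse_row(example))
```

In the Selenium loop, you could call `parse_row([e.text for e in row.select("td")])` for each row and keep the results that are not `None`.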
