Creating a loop to open links in sequence

Posted 2024-09-28 20:46:15


This site:
https://int.soccerway.com/international/europe/european-championships/c25/

EUROPE

European Championship
                     2020
                         Group Stage
                         Final Stages
EC Qualification
WC Qualification Europe
UEFA Nations League
Baltic Cup

Its left-hand menu has two links, Group Stage and Final Stages:

https://int.soccerway.com/international/europe/european-championships/2020/group-stage/r38188/
https://int.soccerway.com/international/europe/european-championships/2020/s13030/final-stages/

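I am collecting the links, but when I try to open the pages one after another, the loop stops at the first link and never opens the second. What should I change?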

url = "https://int.soccerway.com/international/europe/european-championships/c25/"

driver.get(url)
links_level_2 = driver.find_elements_by_xpath("//ul[contains(@class,'level-2')]/li/a")
for link_level_2 in links_level_2:
    level_2 = link_level_2.get_attribute("href")
    driver.get(level_2)

1 answer
User
#1 · Posted 2024-09-28 20:46:15

This is easy to do with BeautifulSoup.

Since you did not say exactly which links you mean, I assume you are trying to extract the links under the ul; see the code below.
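For context, the Selenium loop in the question most likely stops because `driver.get(level_2)` loads a new page, which leaves the remaining elements in `links_level_2` stale, so the next `get_attribute` call fails. A minimal sketch of one way around that, assuming Selenium 4 (where `find_elements_by_xpath` is no longer available): read every `href` into a plain list first, then navigate. The helper names `collect_hrefs` and `visit_all` are hypothetical, chosen only for this illustration.

```python
def collect_hrefs(driver, xpath):
    """Return the href strings of all elements matching xpath on the current page."""
    # "xpath" is the string value of Selenium's By.XPATH locator strategy.
    elements = driver.find_elements("xpath", xpath)
    # Read the attribute now, while the elements still belong to the loaded page.
    return [element.get_attribute("href") for element in elements]


def visit_all(driver, start_url, xpath):
    """Open start_url, collect the matching hrefs, then open each one in turn."""
    driver.get(start_url)
    hrefs = collect_hrefs(driver, xpath)  # plain strings survive navigation
    visited = []
    for href in hrefs:
        if href:  # skip anchors that have no href attribute
            driver.get(href)
            visited.append(href)
    return visited
```

With a real browser this could be called as `visit_all(webdriver.Chrome(), url, "//ul[contains(@class,'level-2')]/li/a")`, using the `url` and XPath from the question. Because the loop iterates over strings rather than page elements, navigating away no longer invalidates anything it still needs.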

Here is the code using BeautifulSoup. It fetches each of the two links mentioned above and prints the links found under its <ul>.

import bs4 as bs
import requests

urls = [
    'https://int.soccerway.com/international/europe/european-championships/2020/group-stage/r38188/',
    'https://int.soccerway.com/international/europe/european-championships/2020/s13030/final-stages/',
]
# Send a browser-like User-Agent header with each request
headers = {"User-Agent": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/77.0.3865.120 Safari/537.36"}

# For each page, print the href of every link under the first <ul class="level-2">
for url in urls:
    resp = requests.get(url, headers=headers, timeout=30)
    resp.raise_for_status()  # fail early on an HTTP error status
    soup = bs.BeautifulSoup(resp.text, 'lxml')  # the 'lxml' parser needs the lxml package
    menu = soup.find('ul', class_='level-2')
    for item in menu.find_all('li'):
        print(item.find('a')['href'])
    print('\n')

/international/europe/european-championships/2020/group-stage/r38188/
/international/europe/european-championships/2020/group-stage/group-a/g10136/
/international/europe/european-championships/2020/group-stage/group-b/g10137/
/international/europe/european-championships/2020/group-stage/group-c/g10138/
/international/europe/european-championships/2020/group-stage/group-d/g10139/
/international/europe/european-championships/2020/group-stage/group-e/g10140/
/international/europe/european-championships/2020/group-stage/group-f/g10141/
/international/europe/european-championships/2020/s13030/final-stages/


/international/europe/european-championships/2020/group-stage/r38188/
/international/europe/european-championships/2020/s13030/final-stages/
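The hrefs printed above are site-relative paths, so they cannot be passed to `requests.get` or `driver.get` as they are. A small sketch of turning them into absolute URLs with the standard library's `urljoin`; the `BASE_URL` constant and the `to_absolute` helper are assumptions made for this example, not part of the answer's code.

```python
from urllib.parse import urljoin

# Assumed base: the scheme and host of the pages that were scraped.
BASE_URL = "https://int.soccerway.com/"


def to_absolute(hrefs, base=BASE_URL):
    """Resolve each (possibly relative) href against the base URL."""
    return [urljoin(base, href) for href in hrefs]


relative = [
    "/international/europe/european-championships/2020/group-stage/r38188/",
    "/international/europe/european-championships/2020/s13030/final-stages/",
]
for absolute_url in to_absolute(relative):
    print(absolute_url)
```

`urljoin` leaves an href that is already absolute unchanged, so the same helper should be safe to apply to a mixed list.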
