Scraping multiple web pages with Python

Published 2024-10-01 04:45:32


I want to scrape several sites with similar URLs, such as https://woollahra.ljhooker.com.au/our-team, https://chinatown.ljhooker.com.au/our-team, and https://bondibeach.ljhooker.com.au/our-team.

I have already written a script that works for the first site, but I'm not sure how to tell it to scrape the other two.

My code:

from urllib.request import urlopen as uReq
from bs4 import BeautifulSoup as soup

my_url = "https://woollahra.ljhooker.com.au/our-team"

uClient = uReq(my_url)        # download the page
page_html = uClient.read()
uClient.close()

page_soup = soup(page_html, "html.parser")
containers = page_soup.findAll("div", {"class":"team-details"})

for container in containers:
    agent_name = container.findAll("div", {"class":"team-name"})
    name = agent_name[0].text

    phone = container.findAll("span", {"class":"phone"})
    mobile = phone[0].text

    print("name: " + name)
    print("mobile: " + mobile)

Is there a way to simply list the varying parts of the URL (woollahra, chinatown, bondibeach) so the script can loop through each page using the code I have already written?


2 Answers
locations = ['woollahra', 'chinatown', 'bondibeach']
for location in locations:
    my_url = 'https://' + location + '.ljhooker.com.au/our-team'

The rest of your code then follows inside the loop and runs for each element in the list, and you can add more locations later, for example:
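A minimal sketch of the full loop, dropping in the parsing code from the question unchanged (assuming the markup is the same on all three sites):

from urllib.request import urlopen as uReq
from bs4 import BeautifulSoup as soup

locations = ['woollahra', 'chinatown', 'bondibeach']

for location in locations:
    my_url = 'https://' + location + '.ljhooker.com.au/our-team'

    # download and parse the current page
    page_html = uReq(my_url).read()
    page_soup = soup(page_html, "html.parser")

    # same selectors as the original script
    containers = page_soup.findAll("div", {"class": "team-details"})
    for container in containers:
        name = container.findAll("div", {"class": "team-name"})[0].text
        mobile = container.findAll("span", {"class": "phone"})[0].text

        print("name: " + name)
        print("mobile: " + mobile)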

You just need a loop:

for team in ["woollahra", "chinatown", "bondibeach"]:
    my_url = "https://{}.ljhooker.com.au/our-team".format(team)
    page_html = uReq(my_url).read()   # fetch each page before parsing it
    page_soup = soup(page_html, "html.parser")

    # make sure you indent the rest of the code 
