I want to scrape all the business links from this page

Posted 2024-09-21 03:18:27


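I want to extract all of the company links (not the titles). Please point me in the right direction! Thanks. This is the URL of the page: https://hipages.com.au/find/antenna_services/nsw/sydney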

This is my code:

import requests
from bs4 import BeautifulSoup
import re

def get_index_data(soup):
    try:
        links = soup.find_all('a', {'class': 'sc-bZQynM sc-iwsKbI dpKmnV'}).get('href')
    except:
        links = []
    print(links)
def Main():
    r = requests.get("https://hipages.com.au/find/antenna_services/nsw/sydney")
    get_index_data(r)
Main()

1 answer
User
Answer 1 · Posted 2024-09-21 03:18:27
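import requests
from bs4 import BeautifulSoup

# Download the listing page and parse its HTML.
r = requests.get("https://hipages.com.au/find/antenna_services/nsw/sydney")
soup = BeautifulSoup(r.text, 'html.parser')

# Each business title is an <h3>. The element parsed just before it is
# expected to be the enclosing <a>, whose href holds the relative link.
for item in soup.find_all("h3", {'class': 'sc-bZQynM sc-iwsKbI dpKmnV'}):
    print(f"https://hipages.com.au{item.previous_element.get('href')}")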

Output:

https://hipages.com.au/connect/glencoelectricalbuildingmaintenanceairconditioningsecurityalarmscctv
https://hipages.com.au/connect/emcoelectricalservices
https://hipages.com.au/connect/abcelectricservicespl/service/126298
https://hipages.com.au/connect/ozyblindsnscreens
https://hipages.com.au/connect/samedaytvantennaservice
https://hipages.com.au/connect/langenelectricalnsw
https://hipages.com.au/connect/allprohandymanmaintenance
https://hipages.com.au/connect/amateairconditioningrefrigerationservices
https://hipages.com.au/connect/makeurmove
https://hipages.com.au/connect/uberantennas/service/184323
https://hipages.com.au/connect/cmkelectricalanddata
https://hipages.com.au/connect/antennadistributionservicesptyltd
https://hipages.com.au/connect/sydneysparky
https://hipages.com.au/connect/bluediamond
https://hipages.com.au/connect/digiproantennas
https://hipages.com.au/connect/vascom
https://hipages.com.au/connect/sparkyselectricalanddataptyltd
https://hipages.com.au/connect/prosparksolutions
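
The class string `sc-bZQynM sc-iwsKbI dpKmnV` looks auto-generated and may change whenever the site is rebuilt, so matching on it could be fragile. As a rough alternative sketch, the links could be selected by their path instead, since every URL in the output above starts with `/connect/`. That prefix is an assumption inferred only from this output, not something the site documents. The names `ConnectLinkParser`, `extract_business_links` and `SAMPLE_HTML` are hypothetical, and the sketch swaps in the standard library's `html.parser` in place of BeautifulSoup so it runs without extra packages. It works on an HTML string, so in practice you would pass it `r.text` from the request above.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

BASE_URL = "https://hipages.com.au"


class ConnectLinkParser(HTMLParser):
    """Collect hrefs of <a> tags whose path starts with a given prefix."""

    def __init__(self, prefix="/connect/"):
        super().__init__()
        # Assumption: business pages live under /connect/ (inferred from the output).
        self.prefix = prefix
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if href and href.startswith(self.prefix):
            url = urljoin(BASE_URL, href)
            # Skip duplicates while keeping the order of first appearance.
            if url not in self.links:
                self.links.append(url)


def extract_business_links(html):
    """Return absolute business URLs found in an HTML string."""
    parser = ConnectLinkParser()
    parser.feed(html)
    return parser.links


# Made-up markup for illustration only; the real page structure may differ.
SAMPLE_HTML = """
<div>
  <a href="/connect/vascom"><h3 class="x">Vascom</h3></a>
  <a href="/find/antenna_services">Other category</a>
  <a href="/connect/bluediamond"><h3 class="x">Blue Diamond</h3></a>
  <a href="/connect/vascom">Duplicate link</a>
</div>
"""

print(extract_business_links(SAMPLE_HTML))
```

If the listing is rendered by JavaScript after the page loads, neither this sketch nor the class-based loop will see the links in `r.text`, so it is worth checking the downloaded HTML first.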
