Python: get anchor text and href values, but ignore image links

Posted 2024-09-28 17:15:37

Male | A programmer who enjoys writing Python code.

I am using the following Python code to scrape the anchor text of each link, and the corresponding href value, from a page path:

from bs4 import BeautifulSoup
import requests

url = "https://www.mydomain.co.uk/contact-us"

# Fetch the page once and parse it
response = requests.get(url)
soup = BeautifulSoup(response.text, "lxml")

# Print the anchor text and href of every link
for link in soup.find_all('a'):
    print(link.text, '-', link.get('href'))

It works, but it also scrapes image links; when the link is an image, only "-" is printed in place of the anchor text. For example:

Contact Us - /contact-us
About Us - /about
- /locations

I would like it to ignore any image href links, so that the output is:

Contact Us - /contact-us
About Us - /about

Is this possible?

Thanks


1 answer

User

#1 · Posted 2024-09-28 17:15:37

for link in soup.find_all('a'):
    if not link.find('img'):
        print(link.text, '-', link.get('href'))
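The check above can be tried without a network request. The following is a minimal sketch, assuming a small hand-written HTML snippet in place of the real page; `SAMPLE_HTML` and the helper name `text_links` are made up for illustration, and the built-in `html.parser` is swapped in for `lxml` so no extra parser is needed.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup standing in for the fetched page.
SAMPLE_HTML = """
<a href="/contact-us">Contact Us</a>
<a href="/about">About Us</a>
<a href="/locations"><img src="/img/map.png" alt="Locations"></a>
"""


def text_links(html):
    """Return (anchor text, href) pairs, skipping anchors that contain an <img>."""
    soup = BeautifulSoup(html, "html.parser")
    pairs = []
    for link in soup.find_all("a"):
        if link.find("img") is not None:
            continue  # the anchor wraps an image, so skip it
        pairs.append((link.get_text(strip=True), link.get("href")))
    return pairs


for text, href in text_links(SAMPLE_HTML):
    print(text, "-", href)
```

Note that this also skips anchors that contain both an image and visible text. If those should be kept, one option is to test whether `link.get_text(strip=True)` is empty instead of looking for an `img` tag.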
