Beautiful Soup 4 findAll() does not match elements in <img> tags

import webbrowser, time, sys, requests, os, bs4  # Not all libraries are used in this code snippet
from selenium import webdriver

browser = webdriver.Firefox()
browser.get("https://imgur.com/t/lenovo/mLwnorj")

res = requests.get("https://imgur.com/t/lenovo/mLwnorj")
res.raise_for_status()

soup = bs4.BeautifulSoup(res.text, features="html.parser")
imageElement = soup.findAll('img', {'class': 'post-image-placeholder'})
print(imageElement)

2 Answers

User

Answer 1 · edited 2024-10-02 04:25:44

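If the site inserts the objects after the page has loaded, you need to use Selenium instead of requests, because requests only sees the HTML the server sends and never runs the page's JavaScript.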
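A quick way to check whether this applies is to count the matching <img> tags in the markup that requests receives and compare it with what the browser holds after scripts have run. The following is only a minimal sketch using the standard library: the two HTML strings are hypothetical stand-ins for res.text and browser.page_source (not the real imgur markup), and count_images is an illustrative helper name.

```python
from html.parser import HTMLParser


class ImgClassCounter(HTMLParser):
    """Count <img> tags that carry a given CSS class."""

    def __init__(self, css_class):
        super().__init__()
        self.css_class = css_class
        self.count = 0

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        # The class attribute may be missing or empty, so fall back to "".
        classes = (dict(attrs).get("class") or "").split()
        if self.css_class in classes:
            self.count += 1


def count_images(html, css_class="post-image-placeholder"):
    """Hypothetical helper: number of <img class=css_class> tags in html."""
    parser = ImgClassCounter(css_class)
    parser.feed(html)
    return parser.count


# Assumed examples only: a bare page shell as a server might send it,
# and the same page after JavaScript has inserted an image.
static_html = '<div id="root"></div><script src="app.js"></script>'
rendered_html = (
    '<div id="root">'
    '<img class="post-image-placeholder" src="//i.imgur.com/JfLsH5yr.jpg">'
    '</div>'
)

print(count_images(static_html))    # 0
print(count_images(rendered_html))  # 1
```

If the count for the raw response is zero while the browser clearly shows the images, the elements are being added by JavaScript and the rendered page source is needed.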

from bs4 import BeautifulSoup
from selenium import webdriver

url = 'https://imgur.com/t/lenovo/mLwnorj'
browser = webdriver.Firefox()
browser.get(url)
soup = BeautifulSoup(browser.page_source, 'html.parser')
images = soup.find_all('img', {'class': 'post-image-placeholder'})

for image in images:
    print(image['src'])

# //i.imgur.com/JfLsH5yr.jpg
# //i.imgur.com/lLcKMBzr.jpg
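
The src values printed above are protocol-relative (they start with //), so they cannot be passed to requests.get as they are. One possible way to turn them into full URLs, assuming the page URL from the question is used as the base, is urljoin from the standard library. The helper name absolute_src is illustrative only.

```python
from urllib.parse import urljoin

# Page URL taken from the question; it supplies the missing scheme.
PAGE_URL = "https://imgur.com/t/lenovo/mLwnorj"


def absolute_src(src, page_url=PAGE_URL):
    """Resolve a protocol-relative or relative src against the page URL."""
    return urljoin(page_url, src)


print(absolute_src("//i.imgur.com/JfLsH5yr.jpg"))
# https://i.imgur.com/JfLsH5yr.jpg
```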

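User

Answer 2 · edited 2024-10-02 04:25:44

The underlying problem here seems to be that the actual <img ...> elements do not exist when the page is first loaded. In my view, the best solution is to use the Selenium WebDriver you already have available to fetch the images. Selenium lets the page render fully (JavaScript and all), after which you can locate whichever elements you care about.

For example: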

import webbrowser, time, sys, requests, os, bs4      # Not all libraries are used in this code snippet
from selenium import webdriver
from selenium.webdriver.common.by import By

# For pretty debugging output
import pprint


browser = webdriver.Firefox()
browser.get("https://imgur.com/t/lenovo/mLwnorj")

# Give element lookups up to 10 seconds of a grace period so the page can
# finish rendering before complaining about images not being found.
browser.implicitly_wait(10)

# Find elements via Selenium's search.
# find_elements_by_css_selector is no longer available in recent Selenium 4
# releases, so the By-based form is used here.
selenium_image_elements = browser.find_elements(By.CSS_SELECTOR, 'img.post-image-placeholder')
pprint.pprint(selenium_image_elements)

# Use the rendered page source to find the same elements with BeautifulSoup 4
soup = bs4.BeautifulSoup(browser.page_source, features="html.parser")

soup_image_elements = soup.find_all('img', {'class': 'post-image-placeholder'})
pprint.pprint(soup_image_elements)

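~~I can't say I've tested this code,~~ but the general concept should work.

Update:

I went ahead and tested it, fixed a few bugs in the code, and got the results I was hoping to see.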
