Python scraper can't scrape img tags

from bs4 import BeautifulSoup
import cfscrape
import requests

scraper = cfscrape.create_scraper()
url = "http://kissmanga.com/Manga/Bleach/Bleach-634--Friend-004?id=235206"
response = requests.get(url)
soup2 = BeautifulSoup(response.text, 'html.parser')
divImage = soup2.find('div', {"id": "divImage"})
for img in divImage.findAll('img'):
    print(img)
response.close()

2 answers

User

Answer 1 · edited 2024-07-01 06:53:08

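You need to wait for JavaScript to inject the HTML for the images; the markup that requests downloads does not contain them yet.

Several tools can do this.

I was able to get it working with Selenium: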

from bs4 import BeautifulSoup

from selenium import webdriver
from selenium.common.exceptions import TimeoutException

driver = webdriver.Firefox()
# it takes forever to load the page, therefore we are setting a threshold
driver.set_page_load_timeout(5)

try:
    driver.get("http://kissmanga.com/Manga/Bleach/Bleach-634--Friend-004?id=235206")
except TimeoutException:
    # never ignore exceptions silently in real world code
    pass

soup2 = BeautifulSoup(driver.page_source, 'html.parser')
divImage = soup2.find('div', {"id": "divImage"})

# close the browser 
driver.close()

for img in divImage.findAll('img'):
    print(img.get('src'))

If you also want to download the images, see How to download image using requests.
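
For the download step, something along these lines could work. It is a rough sketch, not the linked answer's code: `filename_from_url`, `download_images`, the `pages` directory and the fallback file name are all made up for illustration, and `urls` is assumed to be the list of `src` values collected above.

```python
import os
from urllib.parse import urlsplit

import requests


def filename_from_url(url, index):
    """Take the last path segment as the file name, or fall back to a numbered name."""
    name = os.path.basename(urlsplit(url).path)
    return name or "image_{:03d}.jpg".format(index)


def download_images(urls, dest_dir="pages", session=None):
    """Stream each URL into dest_dir and return the list of written file paths."""
    session = session or requests.Session()
    os.makedirs(dest_dir, exist_ok=True)
    saved = []
    for index, url in enumerate(urls):
        # stream=True avoids holding a whole image in memory at once
        response = session.get(url, stream=True, timeout=30)
        response.raise_for_status()
        path = os.path.join(dest_dir, filename_from_url(url, index))
        with open(path, "wb") as handle:
            for chunk in response.iter_content(chunk_size=8192):
                handle.write(chunk)
        saved.append(path)
    return saved
```

Passing your own `session` lets you reuse headers or cookies across all downloads if the image host requires them.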

User

Answer 2 · edited 2024-07-01 06:53:08

Have you tried setting a custom user-agent? Doing so is generally considered unethical, but so is scraping manga.
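
With requests, setting the header might look roughly like this. The User-Agent string and the helper names `make_session` and `fetch_html` are placeholders I made up for the example. Note that a different user-agent does not make requests execute JavaScript, so on its own this may still return a page without the images.

```python
import requests

# Placeholder value (an assumption); substitute the string your own browser sends.
USER_AGENT = "Mozilla/5.0 (X11; Linux x86_64; rv:115.0) Gecko/20100101 Firefox/115.0"


def make_session(user_agent=USER_AGENT):
    """Return a requests session that sends the given User-Agent on every request."""
    session = requests.Session()
    session.headers.update({"User-Agent": user_agent})
    return session


def fetch_html(url, session=None):
    """Fetch a page with the custom User-Agent and return its HTML text."""
    session = session or make_session()
    response = session.get(url, timeout=30)
    response.raise_for_status()
    return response.text
```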
