How to scrape images and save them to a file

Posted 2024-09-22 16:42:59

I can't figure out how to save the scraped images to files on my desktop.

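I'm trying to download the images from the site listed in the code, but I only know the basics, such as importing BeautifulSoup and request. I don't understand what everything means.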

from bs4 import BeautifulSoup
import urllib.request as request


folder = r'C:\Users\rlook\Desktop\scrape' + '\\'
url = "https://www.butterfliesofamerica.com/t/Phocides_belus_a.htm"
response = request.urlopen(url)
soup = BeautifulSoup(response, 'html.parser')
for res in soup.find_all('img'):
    print(res.get('src'))  # lists each image source; saving the images is the missing step

I have some code that works on other sites, but I can't adapt it to my purpose.

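import urllib.request as request
from urllib.parse import urljoin

from bs4 import BeautifulSoup

folder = r'C:\Users\rlook\Desktop\scrape' + '\\'
URL = 'https://www.butterfliesofamerica.com/t/Phocides_belus_a.htm'
response = request.urlopen(URL)
soup = BeautifulSoup(response, 'html.parser')

# First link with class "y"; its nested <img> is expected to be a thumbnail.
icon = soup.find('a', {'class': 'y'})

# urljoin resolves a relative src against the page URL before downloading.
request.urlretrieve(urljoin(URL, icon.img['src']), folder + icon.img['alt'] + '.jpg')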

1 Answer
User
#1 · Posted 2024-09-22 16:42:59

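I used requests and BeautifulSoup (along with re and shutil) to get the full-size images rather than just the thumbnails. You need to run pip install requests and pip install bs4 on the command line before using this solution.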

Code

import re
import shutil

import requests
from bs4 import BeautifulSoup

# Send a browser-like User-Agent with every request.
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/76.0.3809.132 Safari/537.36',
}
base_url = 'https://www.butterfliesofamerica.com'
all_imgs = requests.get(base_url + '/t/Phocides_belus_a.htm', headers=headers)
parsed_imgs = BeautifulSoup(all_imgs.text, 'html.parser')

# Collect the href of every <a class="y">; each one leads to a page with a full-size image.
img_hrefs = [img['href'] for img in parsed_imgs.find_all('a', class_='y')]
for img_href in img_hrefs:
    # The hrefs are relative, so swap '..' for the site root to get an absolute URL.
    real_img_href = img_href.replace('..', base_url)
    image_page = requests.get(real_img_href, headers=headers)

    # The first <img> on the linked page is taken to be the full-size image.
    page_soup = BeautifulSoup(image_page.text, 'html.parser')
    source_image = page_soup.find('img')['src']
    # Use the .jpg/.JPG file name from the src path as the local file name.
    img_name = re.search(r'/([\w\-\.]+?\.(?:jpg|JPG))', source_image).group(1)

    # Stream the image and copy its raw bytes into a file in the current working directory.
    img = requests.get(base_url + source_image, stream=True, headers=headers)
    with open(img_name, 'wb') as img_file:
        shutil.copyfileobj(img.raw, img_file)
        print(img_name, 'found')
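The code above writes each image into the current working directory, while the question asks for a specific folder on the desktop. The following is a minimal sketch, using only the standard library, of one way to resolve a relative link and build a destination path inside a chosen folder. The helper names resolve_url and destination_path, the folder name "scrape", and the sample image link are hypothetical and used only for illustration; the site's real paths may differ.

```python
from pathlib import Path, PurePosixPath
from urllib.parse import unquote, urljoin, urlsplit


def resolve_url(page_url, link):
    """Turn a relative link such as '../pics/a.jpg' into an absolute URL."""
    return urljoin(page_url, link)


def destination_path(folder, image_url):
    """Build a local path from the last segment of the image URL's path."""
    name = PurePosixPath(unquote(urlsplit(image_url).path)).name
    if not name:
        raise ValueError(f"no file name in URL: {image_url!r}")
    return Path(folder) / name


# Hypothetical values, for illustration only.
page_url = "https://www.butterfliesofamerica.com/t/Phocides_belus_a.htm"
image_url = resolve_url(page_url, "../images/example/photo_1.JPG")
target = destination_path("scrape", image_url)
print(image_url)
print(target.name)
```

In the download loop, open(destination_path(folder, url), 'wb') could stand in for open(img_name, 'wb'), where url is the absolute image address. The folder has to exist first; Path(folder).mkdir(parents=True, exist_ok=True) creates it if needed.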
