<p>If you only want the content length of a file at a given URL, you can request just the HTTP headers and check the <code>Content-Length</code> field:</p>
<pre><code>import requests

url = 'https://commons.wikimedia.org/wiki/File:Leptocorisa_chinensis_(20566589316).jpg'
# HEAD asks the server for the headers only, so the body is not downloaded
http_response = requests.head(url, allow_redirects=True)
print(f"Size of image {url} = {http_response.headers['Content-Length']} bytes")
</code></pre>
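<p>However, if the server compresses the image before sending it, the <a href="https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Content-Length" rel="nofollow noreferrer"><code>Content-Length</code></a> field will contain the compressed size (the amount of data actually transferred), not the size of the uncompressed image.</p>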
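<p>The gap between the two sizes can be illustrated offline. The following is only a sketch: it uses the standard library's <code>gzip</code> module to stand in for a server that compresses its responses, and the payload and the <code>headers</code> dictionary are made up for illustration rather than taken from a real response.</p>
<pre><code>import gzip

# Hypothetical payload standing in for a file stored on the server
original = b"pixel data " * 1000

# What a server using "Content-Encoding: gzip" would put on the wire
compressed = gzip.compress(original)

# Headers such a server would attach to the compressed response (assumed)
headers = {"Content-Encoding": "gzip", "Content-Length": str(len(compressed))}

reported_size = int(headers["Content-Length"])
actual_size = len(gzip.decompress(compressed))

print(f"Content-Length reports {reported_size} bytes")
print(f"Decompressed payload is {actual_size} bytes")
</code></pre>
<p>For repetitive data like this the reported size is much smaller than the decompressed size. Formats such as JPEG are already compressed, so for them the two numbers are usually close.</p>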
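<p>To do this for every image on a given page, you can use the <a href="https://www.crummy.com/software/BeautifulSoup/bs4/doc/" rel="nofollow noreferrer">BeautifulSoup HTML processing library</a> to extract a list of the URLs of all images on the page and check each file size, as follows:</p>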
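<pre><code>from time import sleep
from urllib.parse import urljoin

import requests
from bs4 import BeautifulSoup as Soup

url = 'https://en.wikipedia.org/wiki/Agent_Orange'
# Name the parser explicitly so bs4 does not have to guess one
html = Soup(requests.get(url).text, 'html.parser')
# href values are usually relative, so resolve them against the page URL
image_links = [urljoin(url, a['href']) for a in html.find_all('a', {'class': 'image'})]
for img_url in image_links:
    # HEAD fetches only the headers, not the file itself
    response = requests.head(img_url, allow_redirects=True)
    try:
        print(f"Size of image {img_url} = {response.headers['Content-Length']} bytes")
    except KeyError:
        print(f"Server didn't specify content length in headers for {img_url}")
    sleep(0.5)  # pause between requests to go easy on the server
</code></pre>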
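<p>Because <code>href</code> values scraped from a page are often relative or protocol-relative, they generally need to be resolved against the page URL before they can be requested. A minimal sketch using the standard library's <code>urllib.parse.urljoin</code> follows; the <code>href</code> values are hypothetical and only illustrate the kinds of links a scraper may encounter.</p>
<pre><code>from urllib.parse import urljoin

page_url = 'https://en.wikipedia.org/wiki/Agent_Orange'

# Hypothetical href values, not taken from the real page
hrefs = [
    '/wiki/File:Example.jpg',                  # root-relative
    '//upload.wikimedia.org/example.jpg',      # protocol-relative
    'https://example.org/images/example.jpg',  # already absolute
]

# urljoin resolves each href against the page it was found on
resolved = [urljoin(page_url, href) for href in hrefs]
for link in resolved:
    print(link)
</code></pre>
<p>Plain string concatenation would instead append each <code>href</code> to the full page URL, which does not produce a valid address in any of these three cases.</p>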
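<p>You will have to adapt this to your specific problem, and you may need to pass other arguments to <a href="https://www.crummy.com/software/BeautifulSoup/bs4/doc/#find-all" rel="nofollow noreferrer"><code>find_all()</code></a> to narrow the results down to the particular images you are interested in, but something along these lines will do what you are after.</p>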