Max retries exceeded with url (Caused by NewConnectionError)

Posted 2024-04-26 22:59:20

I'm trying to write code that scrapes and downloads specific files from archive.org. When I run the program, I get this error:

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "D:\ROMS\Gamecube\main.py", line 16, in <module>
    response = requests.get(DOMAIN + file_link)
  File "C:\Users\cycle\AppData\Local\Programs\Python\Python38-32\lib\site-packages\requests\api.py", line 76, in get
    return request('get', url, params=params, **kwargs)
  File "C:\Users\cycle\AppData\Local\Programs\Python\Python38-32\lib\site-packages\requests\api.py", line 61, in request
    return session.request(method=method, url=url, **kwargs)
  File "C:\Users\cycle\AppData\Local\Programs\Python\Python38-32\lib\site-packages\requests\sessions.py", line 530, in request
    resp = self.send(prep, **send_kwargs)
  File "C:\Users\cycle\AppData\Local\Programs\Python\Python38-32\lib\site-packages\requests\sessions.py", line 643, in send
    r = adapter.send(request, **kwargs)
  File "C:\Users\cycle\AppData\Local\Programs\Python\Python38-32\lib\site-packages\requests\adapters.py", line 516, in send
    raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: HTTPSConnectionPool(host='archive.org007%20-%20agent%20under%20fire%20%28usa%29.nkit.gcz', port=443): Max retries exceeded with url: / (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x043979B8>: Failed to establish a new connection: [Errno 11001] getaddrinfo failed'))

Here is my code:

from bs4 import BeautifulSoup as bs
import requests

DOMAIN = 'https://archive.org'
URL = 'https://archive.org/download/GCRedumpNKitPart1'
FILETYPE = '%28USA%29.nkit.gcz'

def get_soup(url):
    return bs(requests.get(url).text, 'html.parser')

for link in get_soup(URL).find_all('a'):
    file_link = link.get('href')
    if FILETYPE in file_link:
        print(file_link)
        with open(link.text, 'wb') as file:
            response = requests.get(DOMAIN + file_link)
            file.write(response.content)

1 Answer

Answer 1 · posted 2024-04-26 22:59:20

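You simply forgot the / after https://archive.org, so DOMAIN + file_link produces a malformed URL. The host= value in the traceback shows the result: the file name is glued directly onto archive.org, the whole string is treated as a host name, and the DNS lookup fails (getaddrinfo failed).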
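To see why this surfaces as a connection error rather than an HTTP 404, the concatenated URL can be split with the standard library. This is a minimal sketch; the file_link value is an assumption, reconstructed from the host name shown in the traceback.

```python
from urllib.parse import urlsplit

DOMAIN = 'https://archive.org'
# Assumed href: reconstructed from the traceback, not read from the page itself.
file_link = '007%20-%20Agent%20Under%20Fire%20%28USA%29.nkit.gcz'

bad_url = DOMAIN + file_link  # no '/' between the host and the file name
parts = urlsplit(bad_url)

# The file name has been absorbed into the network location; the path is empty.
print(parts.netloc)
print(repr(parts.path))
```

Because nothing ends the host part, name resolution fails before any HTTP request is sent.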

Add a / at the end of the domain

DOMAIN = 'https://archive.org/'

Or add the / when building the URL

response = requests.get(DOMAIN + '/' + file_link)

Or use urllib.parse.urljoin() to build the URL

import urllib.parse

response = requests.get(urllib.parse.urljoin(DOMAIN, file_link))
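One thing worth checking with urljoin() is which base the link is joined against. The href in the traceback has no leading slash, so it is presumably relative to the listing page rather than to the site root. Whether https://archive.org/<file name> actually serves the file is not confirmed here. The sketch below only compares what urljoin() returns for different bases; the href value and the PAGE_URL name are assumptions for illustration.

```python
from urllib.parse import urljoin

DOMAIN = 'https://archive.org'
URL = 'https://archive.org/download/GCRedumpNKitPart1'
PAGE_URL = URL + '/'  # hypothetical name: the listing URL with a trailing slash
# Assumed href: reconstructed from the traceback.
href = '007%20-%20Agent%20Under%20Fire%20%28USA%29.nkit.gcz'

from_root = urljoin(DOMAIN, href)      # joined against the site root
from_page = urljoin(PAGE_URL, href)    # joined against the listing "directory"
no_slash = urljoin(URL, href)          # last path segment of the base is replaced

for label, value in [('root', from_root), ('page', from_page), ('no slash', no_slash)]:
    print(f'{label:9s} {value}')
```

If the links turn out to be relative to the listing page, joining against the page URL with a trailing slash (or against response.url after any redirect) is the variant that mirrors how a browser resolves them.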
