我正在尝试创建代码,从archive.org中抓取和下载特定文件。当我运行程序时,我遇到了这个代码错误
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "D:\ROMS\Gamecube\main.py", line 16, in <module>
response = requests.get(DOMAIN + file_link)
File "C:\Users\cycle\AppData\Local\Programs\Python\Python38-32\lib\site-packages\requests\api.py", line 76, in get
return request('get', url, params=params, **kwargs)
File "C:\Users\cycle\AppData\Local\Programs\Python\Python38-32\lib\site-packages\requests\api.py", line 61, in request
return session.request(method=method, url=url, **kwargs)
File "C:\Users\cycle\AppData\Local\Programs\Python\Python38-32\lib\site-packages\requests\sessions.py", line 530, in request
resp = self.send(prep, **send_kwargs)
File "C:\Users\cycle\AppData\Local\Programs\Python\Python38-32\lib\site-packages\requests\sessions.py", line 643, in send
r = adapter.send(request, **kwargs)
File "C:\Users\cycle\AppData\Local\Programs\Python\Python38-32\lib\site-packages\requests\adapters.py", line 516, in send
raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: HTTPSConnectionPool(host='archive.org007%20-%20agent%20under%20fire%20%28usa%29.nkit.gcz', port=443): Max retries exceeded with url: / (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x043979B8>: Failed to establish a new connection: [Errno 11001] getaddrinfo failed'))
这是我的代码:
from bs4 import BeautifulSoup as bs
import requests
DOMAIN = 'https://archive.org'
URL = 'https://archive.org/download/GCRedumpNKitPart1'
FILETYPE = '%28USA%29.nkit.gcz'
def get_soup(url):
return bs(requests.get(url).text, 'html.parser')
for link in get_soup(URL).find_all('a'):
file_link = link.get('href')
if FILETYPE in file_link:
print(file_link)
with open(link.text, 'wb') as file:
response = requests.get(DOMAIN + file_link)
file.write(response.content)
您只是在
https://archive.org
之后忘记了/
,因此创建了错误的URL在域的末尾添加
/
或者稍后添加
/
或者使用
urllib.parse.urljoin()
创建URL相关问题 更多 >
编程相关推荐