Using grequests to send thousands of GET requests to SourceForge, getting "Max retries exceeded with url"

ConnectionError: HTTPConnectionPool(host='sourceforge.net', port=80): Max retries exceeded with url: /api/project/name/p2p-fs/json (Caused by <class 'socket.gaierror'>: [Errno 8] nodename nor servname provided, or not known)
<Greenlet at 0x109b790f0: <bound method AsyncRequest.send of <grequests.AsyncRequest object at 0x10999ef50>>(stream=False)> failed with ConnectionError

2 Answers

User

#1 · edited 2024-06-30 17:00:25

This can easily be changed to use any number of connections.

import time
import grequests

MAX_CONNECTIONS = 100  # Number of connections you want to limit it to
# urlsList: your list of URLs.

results = []
# Start at index 0 and walk the whole list one batch at a time.
for x in range(0, len(urlsList), MAX_CONNECTIONS):
    rs = (grequests.get(u, stream=False) for u in urlsList[x:x + MAX_CONNECTIONS])
    time.sleep(0.2)  # You can change this to whatever you see works better.
    results.extend(grequests.map(rs))  # The key here is to extend, not append, not insert.
    print("Waiting")  # Optional, so you see something is done.
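The batching step above can be pulled out into a small helper so the slicing logic is easy to check on its own. This is only a sketch: `chunked` is a hypothetical name, not part of grequests, and each batch it yields would still be passed to grequests.get and grequests.map as in the loop above.

```python
def chunked(items, size):
    """Yield consecutive slices of `items`, each at most `size` long.

    Hypothetical helper for illustration; it starts at index 0 so no
    element is skipped, and the last batch may be shorter than `size`.
    """
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


# Example with placeholder data instead of real URLs.
batches = list(chunked(["u0", "u1", "u2", "u3", "u4"], 2))
print(batches)
```

With five items and a batch size of 2, the helper produces three batches, the last one holding a single item.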

User

#2 · edited 2024-06-30 17:00:25

So, I'm answering here; maybe it will help others.

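In my case it was not rate limiting by the target server but something simpler: I was not explicitly closing the responses, so they kept their sockets open and the Python process ran out of file handles.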
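To see how many file handles a process may hold before it hits this kind of failure, the limit can be read from the standard library. A minimal sketch, assuming a Unix-like system (the `resource` module is not available on Windows); the function name `open_file_limit` is made up for illustration.

```python
import resource


def open_file_limit():
    """Return the (soft, hard) limits on open file descriptors for this process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


soft, hard = open_file_limit()
print(f"soft limit: {soft}, hard limit: {hard}")
```

Every open socket counts against the soft limit, so thousands of unclosed responses can exhaust it even when the remote server is not throttling anything.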

My solution (I'm not sure which of the two fixed the problem; in theory either one should) was:

Set stream=False in grequests.get:

rs = (grequests.get(u, stream=False) for u in urls)

Explicitly call response.close() after reading response.content:

responses = grequests.map(rs)
for response in responses:
    if response is None:  # grequests.map returns None for requests that failed
        continue
    make_use_of(response.content)
    response.close()
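If the code that consumes the content can raise, the close call may be skipped and the socket leaks anyway. One way to guard against that is `contextlib.closing`, sketched below. `FakeResponse` and `consume_all` are hypothetical names used only so the example runs offline; with real responses from grequests.map the same loop would apply, assuming failed requests come back as None.

```python
from contextlib import closing


class FakeResponse:
    """Hypothetical stand-in for a response object; only content and close() matter here."""

    def __init__(self, content):
        self.content = content
        self.closed = False

    def close(self):
        self.closed = True


def consume_all(responses, handle):
    """Pass each response's content to `handle`, closing the response even if `handle` raises."""
    for response in responses:
        if response is None:  # skip entries for requests that failed
            continue
        with closing(response):
            handle(response.content)


seen = []
fakes = [FakeResponse(b"a"), None, FakeResponse(b"b")]
consume_all(fakes, seen.append)
print(seen)
```

The `with closing(...)` block calls `close()` on exit whether or not the handler succeeded, which keeps the explicit-close fix from depending on error-free processing.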

Note: merely destroying the response object (assigning None to it, calling gc.collect()) is not enough; this did not close the file handles.
