How can I keep retrying a urllib2 request through a proxy when any kind of error (or timeout) occurs?

import urllib2, re

proxy = "*.*.*.*:8080"
proxies = {"http": "http://%s" % proxy}
headers = {'User-agent': 'Mozilla/5.0'}

# rest of code here
for num, cname in enumerate(match):
    r = re.compile('epi/(.*?)/')
    m = r.search(cname[0])
    episodeId = m.group(1)
    url = "http://api.somesite.net/api/data/Episode/" + str(episodeId)
    proxy_support = urllib2.ProxyHandler(proxies)
    opener = urllib2.build_opener(proxy_support, urllib2.HTTPHandler(debuglevel=0))
    urllib2.install_opener(opener)
    req = urllib2.Request(url, None, headers)
    try:
        html = urllib2.urlopen(req).read()
    except urllib2.URLError as e:
        raise MyException("There was an error: %r" % e)

@retry(urllib2.URLError, tries=4, delay=3, backoff=2)
def urlopen_with_retry():
    return urllib2.urlopen("http://example.com")

1 answer

User

#1 · Posted on 2024-09-26 17:44:32

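As we discussed in the comments, you can use try/except to keep the loop from crashing on an error (I see you have already changed your original code following that suggestion).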

Next, when calling urlopen (see the documentation), you can pass a longer timeout, given in seconds.
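As a minimal sketch of the timeout suggestion combined with the proxy setup from the question: the helper names `build_proxy_opener` and `fetch` are hypothetical, and the 30-second default is only an example value. The import fallback is an assumption added so the same sketch also runs on Python 3, where `urllib.request` exposes the names used here.

```python
try:
    import urllib2  # Python 2, as in the question
except ImportError:
    import urllib.request as urllib2  # Python 3 fallback with the same names


def build_proxy_opener(proxy):
    """Return an opener that sends http:// requests through `proxy` (host:port)."""
    proxies = {"http": "http://%s" % proxy}
    return urllib2.build_opener(urllib2.ProxyHandler(proxies))


def fetch(opener, url, headers, timeout=30):
    """Fetch `url` and return the body; `timeout` is in seconds."""
    req = urllib2.Request(url, None, headers)
    response = opener.open(req, timeout=timeout)
    try:
        return response.read()
    finally:
        response.close()
```

Building the opener once, before the for loop, and calling `fetch(opener, url, headers)` inside it avoids rebuilding and reinstalling the opener on every iteration, which the code in the question does.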

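In addition, inside the for loop you can add another loop that tries to retrieve the data a fixed number of times and breaks out as soon as urlopen gets what it needs. The following code is based on this answer: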

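import socket

# number of times to try opening the URL
attempts = 10
# timeout in seconds for each attempt (example value, adjust as needed)
timeout = 30
html = None

for _ in range(attempts):
    try:
        # pass the timeout argument to urlopen
        html = urllib2.urlopen(req, timeout=timeout).read()
        # urlopen retrieved the data successfully, so stop retrying
        break
    # urlopen failed to retrieve the data; read() can raise socket.timeout directly
    except (urllib2.URLError, socket.timeout) as err:
        print("Oops! urlopen failed: %r" % (err,))
# runs only if the loop never hit break, i.e. all attempts failed
else:
    print("Oops! All attempts failed")
# now use your html variable here (it is still None if every attempt failed)
# ...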
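The question's `@retry(urllib2.URLError, tries=4, delay=3, backoff=2)` decorator is not defined in the post, so the following is only a sketch of what retrying with a growing delay could look like. The function `open_with_retry` and its `sleep` parameter are hypothetical names, and this is not the implementation of that decorator. The parameter names `tries`, `delay` and `backoff` are borrowed from the question, and the import fallback is an assumption added so the sketch also runs on Python 3.

```python
import socket
import time

try:
    import urllib2  # Python 2, as in the question
except ImportError:
    import urllib.request as urllib2  # Python 3 fallback with the same names


def open_with_retry(open_func, tries=4, delay=3, backoff=2, sleep=time.sleep):
    """Call open_func() until it succeeds or `tries` attempts are used up.

    Waits `delay` seconds after the first failure and multiplies the wait
    by `backoff` after each further failure. The last error is re-raised.
    """
    retryable = (urllib2.URLError, socket.timeout)
    wait = delay
    for attempt in range(1, tries + 1):
        try:
            return open_func()
        except retryable as err:
            if attempt == tries:
                raise  # out of attempts: let the caller see the error
            print("Attempt %d failed (%r); retrying in %s s" % (attempt, err, wait))
            sleep(wait)
            wait *= backoff
```

Inside the loop from the question it could be called as `html = open_with_retry(lambda: urllib2.urlopen(req, timeout=30).read())`. Passing a callable keeps the helper independent of how the request and proxy opener are built.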

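To those who downvoted: the OP changed his question and code after the discussion in the comments. This answer is a follow-up to that discussion, so please consider the context.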
