urllib2.URLError: reading server response codes (Python)

import urllib2

# List of URLs. The third URL is not a website
urls = ["http://www.google.com", "http://www.ebay.com/broken-link",
        "http://notawebsite_broken"]

# Empty list to store the output
response_codes = []

# Run "for" loop: get server response code and save results to response_codes
for url in urls:
    try:
        connection = urllib2.urlopen(url)
        response_codes.append(connection.getcode())
        connection.close()
        print url, ' - ', connection.getcode()
    except urllib2.HTTPError, e:
        response_codes.append(e.getcode())
        print url, ' - ', e.getcode()

print response_codes

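3 answers

User

Answer 1 · edited 2024-10-01 15:33:06

When urllib2.urlopen() cannot connect to the server, or cannot resolve the host's IP address, it raises a URLError rather than an HTTPError. In addition to urllib2.HTTPError, you need to catch urllib2.URLError to handle these cases.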
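Here is a minimal sketch of that fix, not a definitive implementation. The helper name get_status and its opener parameter are hypothetical, introduced only for illustration. HTTPError is a subclass of URLError, so it has to be caught first. The import fallback is there so the sketch also runs on Python 3, where urllib2 was split into urllib.request and urllib.error.

```python
try:
    # Python 2, as in the question
    from urllib2 import urlopen, HTTPError, URLError
except ImportError:
    # Python 3: urllib2 was split into urllib.request and urllib.error
    from urllib.request import urlopen
    from urllib.error import HTTPError, URLError


def get_status(url, opener=urlopen):
    """Return the HTTP status code for url, or None if no response arrived.

    get_status and opener are hypothetical names used for this sketch.
    """
    try:
        connection = opener(url)
    except HTTPError as e:
        # The server answered, but with an error status such as 404.
        return e.code
    except URLError:
        # No HTTP response at all: DNS failure, refused connection, etc.
        return None
    try:
        return connection.getcode()
    finally:
        connection.close()
```

With the URLs from the question, the loop body could become response_codes.append(get_status(url)), which stores None for the host that cannot be resolved. Other failures, such as a socket timeout during the read, may still surface as different exception types.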

User

Answer 2 · edited 2024-10-01 15:33:06

The urllib2 API is a nightmare.

Many people, myself included, strongly recommend the requests package:

http://docs.python-requests.org/en/latest/

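One nice thing about requests is that every request problem inherits from a single base exception class. When you use urllib2 "raw", it can raise many different exceptions, coming from the socket module and a few other modules besides (I don't remember which, but it's a mess).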
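To illustrate the point about the base class, here is a small sketch. In requests the base class is requests.exceptions.RequestException. The helper name describe, its get parameter, and the 10-second timeout are assumptions made for this example, not part of the answer above.

```python
import requests


def describe(url, get=requests.get):
    """Return the status code for url, or a short description of the failure.

    describe and get are hypothetical names used for this sketch.
    """
    try:
        response = get(url, timeout=10)
    except requests.exceptions.RequestException as e:
        # Connection errors, timeouts, invalid URLs and similar problems
        # all derive from RequestException.
        return "failed: {}".format(type(e).__name__)
    # A 404 or 500 is still a response, so it does not raise here.
    return response.status_code
```

A single except clause covers the unreachable-host case, and a broken link still comes back as an ordinary status code.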

tl;dr: just use the requests library.

User

Answer 3 · edited 2024-10-01 15:33:06

You can use requests:

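import requests

urls = ["http://www.google.com", "http://www.ebay.com/broken-link",
        "http://notawebsite_broken"]

for u in urls:
    try:
        r = requests.get(u)
        # A broken link still returns a response, so its status code is printed
        print("{} {}".format(u, r.status_code))
    except Exception as e:
        # An unreachable or unresolvable host raises instead of returning a response
        print("{} {}".format(u, e))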

http://www.google.com 200
http://www.ebay.com/broken-link 404
http://notawebsite_broken HTTPConnectionPool(host='notawebsite_broken', port=80): Max retries exceeded with url: /
