HTTP 404 error when fetching a URL with Python urllib2

Posted 2024-06-25 22:41:14


I am trying to fetch the following URL: ow dot ly/LApK30cbLKj. The URL works, but I get an HTTP 404 error:

import urllib2

my_url = 'ow' + '.ly/LApK30cbLKj'     # SO won't accept an ow.ly url
headers = {'User-Agent': user_agent}
request = urllib2.Request(my_url, "", headers)

response = None
try:
    response = urllib2.urlopen(request)
except urllib2.HTTPError as e:
    # Report the HTTP status code returned by the server
    print '+++HTTPError = ' + str(e.code)

Is there anything I can do to get the URL with an HTTP 200 status, as I do when I visit it in a browser?


3 Answers

Your example works for me, but you need to add http://

my_url = 'http://ow' + '.ly/LApK30cbLKj'

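You need to specify the URL's scheme. When you visit the URL in a browser, the browser defaults to HTTP. urllib2 does not do that for you: you have to put http:// at the start of the URL yourself, otherwise an error is raised: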

ValueError: unknown url type: ow.ly/LApK30cbLKj
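
To illustrate the point above, here is a minimal sketch of a helper that adds a default scheme when one is missing. The name ensure_scheme and its default_scheme parameter are hypothetical (they are not part of urllib2 or of the original post), and the check is a deliberately simple string test rather than full URL parsing:

```python
def ensure_scheme(url, default_scheme="http"):
    """Return url unchanged if it already names a scheme, else prepend one.

    Hypothetical helper for illustration; it only looks for '://'.
    """
    if "://" in url:
        return url
    # Drop leading slashes so '//ow.ly/x' does not become 'http:////ow.ly/x'
    return default_scheme + "://" + url.lstrip("/")


# The URL from the question, built the same way to avoid writing it whole
print(ensure_scheme("ow" + ".ly/LApK30cbLKj"))
print(ensure_scheme("https://example.com/page"))  # already has a scheme
```

The result could then be passed to urllib2.Request (or to requests.get) in place of the bare host and path.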

As @enjoi mentioned, I used requests:

import traceback

import requests

result = None
try:
    result = requests.get(agen_cont.source_url)
except requests.exceptions.Timeout as e:
    print('+++ timeout exception: ')
    print(e)
except requests.exceptions.TooManyRedirects as e:
    print('+++ too many redirects exception: ')
    print(e)
except requests.exceptions.RequestException as e:
    # Base class for the other exceptions raised by requests
    print('+++ request exception: ')
    print(e)
except Exception:
    print('+++ generic exception: ' + traceback.format_exc())

# A Response is truthy only when the status code is below 400
if result:
    final_url = result.url      # URL after following any redirects
    print(final_url)
    response = result.content
