Python 3.4 urllib.request error (HTTP 403)

import urllib.request
html = urllib.request.urlopen(url)  # same URL as before

  File "C:\Python34\lib\urllib\request.py", line 153, in urlopen
    return opener.open(url, data, timeout)
  File "C:\Python34\lib\urllib\request.py", line 461, in open
    response = meth(req, response)
  File "C:\Python34\lib\urllib\request.py", line 574, in http_response
    'http', request, response, code, msg, hdrs)
  File "C:\Python34\lib\urllib\request.py", line 499, in error
    return self._call_chain(*args)
  File "C:\Python34\lib\urllib\request.py", line 433, in _call_chain
    result = func(*args)
  File "C:\Python34\lib\urllib\request.py", line 582, in http_error_default
    raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 403: Forbidden

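2 answers

User

Answer 1 · edited 2024-05-08 17:22:51

Here are some notes I collected while learning Python 3.
I am keeping them in case they come in handy or help someone else.

How to import `urllib.request` and `urllib.parse`: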

import urllib.request as urlRequest
import urllib.parse as urlParse

How to make a GET request:

url = "http://www.example.net"
# open the url
x = urlRequest.urlopen(url)
# get the source code
sourceCode = x.read()
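
Note that `read()` returns `bytes`, not `str`. A minimal sketch of decoding the body follows; the helper name `decode_body` and the UTF-8 fallback are assumptions for illustration, not part of the original notes.

```python
from typing import Optional


def decode_body(raw: bytes, charset: Optional[str] = None) -> str:
    """Decode a response body, falling back to UTF-8 when no charset is given."""
    # errors="replace" keeps this sketch from raising on stray bytes.
    return raw.decode(charset or "utf-8", errors="replace")


# With a live response `x` from urlopen(), this might be used as:
#   text = decode_body(x.read(), x.headers.get_content_charset())
print(decode_body("caf\u00e9".encode("utf-8")))
print(decode_body("caf\u00e9".encode("latin-1"), "latin-1"))
```

The charset, when the server declares one, comes from the `Content-Type` header; pages that declare it only inside the HTML would need extra handling.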

How to make a POST request:

url = "https://www.example.com"
values = {"q": "python if"}
# URL-encode the form values
values = urlParse.urlencode(values)
# convert the encoded string to UTF-8 bytes
values = values.encode("UTF-8")
# build the request object (passing data makes it a POST)
targetUrl = urlRequest.Request(url, values)
# send the request
x = urlRequest.urlopen(targetUrl)
# read the response body
sourceCode = x.read()
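
What turns the request into a POST is the presence of a `data` argument. The sketch below checks this without touching the network; the variable names `body`, `get_req`, and `post_req` are chosen here for illustration.

```python
import urllib.parse
import urllib.request

values = {"q": "python if"}
# urlencode() yields a str; urlopen() needs bytes for the request body.
body = urllib.parse.urlencode(values).encode("utf-8")

# No request is sent here; the objects are only built and inspected.
get_req = urllib.request.Request("https://www.example.com")
post_req = urllib.request.Request("https://www.example.com", data=body)

print(body)                   # b'q=python+if'
print(get_req.get_method())   # GET
print(post_req.get_method())  # POST
```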

How to make a POST request (when you get a `403 Forbidden` response):

url = "https://www.example.com"
values = {"q": "python urllib"}
# pretend to be a Chrome 47 browser on a Windows 10 machine
headers = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/47.0.2526.106 Safari/537.36"}
# URL-encode the form values
values = urlParse.urlencode(values)
# convert the encoded string to UTF-8 bytes
values = values.encode("UTF-8")
# build the request object with the body and the custom headers
targetUrl = urlRequest.Request(url=url, data=values, headers=headers)
# send the request
x = urlRequest.urlopen(targetUrl)
# read the response body
sourceCode = x.read()

How to make a GET request (when you get a `403 Forbidden` response):

url = "https://www.example.com"
# pretend to be a chrome 47 browser on a windows 10 machine
headers = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/47.0.2526.106 Safari/537.36"}
req = urlRequest.Request(url, headers = headers)
# open the url
x = urlRequest.urlopen(req)
# get the source code
sourceCode = x.read()

User

Answer 2 · edited 2024-05-08 17:22:51

The site does not seem to like the Python 3.x user agent.

Specifying a User-Agent will solve your problem:

import urllib.request
req = urllib.request.Request(url, headers={'User-Agent': 'Mozilla/5.0'})
html = urllib.request.urlopen(req).read()
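
To see which user agent `urllib` sends by default, you can inspect the headers of a freshly built opener. This is a sketch only; installing a global opener is one possible way to avoid passing `headers` on every call, and the `'Mozilla/5.0'` value is simply reused from the snippet above.

```python
import urllib.request

opener = urllib.request.build_opener()
# A new opener carries a single default header: the Python-urllib user agent.
default_headers = dict(opener.addheaders)
print(default_headers)

# Replace the default so every later urlopen() call sends this User-Agent.
opener.addheaders = [("User-Agent", "Mozilla/5.0")]
urllib.request.install_opener(opener)
```

Headers set on an individual `Request` still take precedence over the opener's defaults.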

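Note that the Python 2.x `urllib` also receives the 403 status, but unlike Python 2.x `urllib2` and Python 3.x `urllib`, it does not raise an exception.

You can confirm this with the following Python 2 code: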

print(urllib.urlopen(url).getcode())  # => 403
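
In Python 3 the equivalent check has to catch the exception, because `urllib.request.urlopen` raises `HTTPError` for a 403. A minimal sketch, assuming a hypothetical helper `status_of` whose `opener` parameter exists only so the function can be exercised without a network connection:

```python
import urllib.error
import urllib.request


def status_of(url, opener=urllib.request.urlopen):
    """Return the HTTP status code for url, even for error responses."""
    try:
        with opener(url) as response:
            return response.getcode()
    except urllib.error.HTTPError as err:
        # HTTPError carries the status code the server answered with.
        return err.code


# Example call (performs a real request, so it is left commented out):
#   print(status_of(url))
```

Connection-level failures such as DNS errors raise `urllib.error.URLError` instead and are deliberately not caught here.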
