503 error when trying to access Google Patents with Python

Published 2024-10-01 13:36:09


Earlier today I was able to query Google Patents with the following code:

import urllib2

url = 'http://www.google.com/search?tbo=p&q=ininventor:"John-Mudd"&hl=en&tbm=pts&source=lnt&tbs=ptso:us'
req = urllib2.Request(url, headers={'User-Agent' : "foobar"})

response = urllib2.urlopen(req)

Now when I run it, I get a 503 error. I had only looped over the code above about 30 times (I am trying to get all the patents for 30 people).


2 answers

A shot in the dark:

Have you checked whether the response includes a Retry-After header? That is entirely possible with a 503.

From RFC 2616

14.37 Retry-After

The Retry-After response-header field can be used with a 503 (Service Unavailable) response to indicate how long the service is expected to be unavailable to the requesting client. This field MAY also be used with any 3xx (Redirection) response to indicate the minimum time the user-agent is asked wait before issuing the redirected request. The value of this field can be either an HTTP-date or an integer number of seconds (in decimal) after the time of the response.

    Retry-After = "Retry-After" ":" ( HTTP-date | delta-seconds )

Two examples of its use are

    Retry-After: Fri, 31 Dec 1999 23:59:59 GMT
    Retry-After: 120

In the latter example, the delay is 2 minutes.
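To make the two value formats concrete, here is a minimal sketch of turning a Retry-After value into a number of seconds to wait. It assumes Python 3 (the question uses Python 2's urllib2), and the helper name `retry_after_seconds` is made up for illustration; it is not part of any library.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def retry_after_seconds(value, now=None):
    """Convert a Retry-After header value to a delay in seconds.

    Handles both forms from RFC 2616: delta-seconds ("120") and
    HTTP-date ("Fri, 31 Dec 1999 23:59:59 GMT").
    Returns None if the value is missing or cannot be parsed.
    """
    if value is None:
        return None
    value = value.strip()

    # Form 1: a plain decimal number of seconds.
    if value.isascii() and value.isdigit():
        return int(value)

    # Form 2: an HTTP-date. Older Python versions raise TypeError on
    # unparseable input, newer ones raise ValueError.
    try:
        when = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return None
    if when is None:
        return None
    if when.tzinfo is None:
        # Assumption: treat a date without a usable zone as UTC.
        when = when.replace(tzinfo=timezone.utc)

    if now is None:
        now = datetime.now(timezone.utc)
    # A date in the past means there is nothing left to wait for.
    return max(0, int((when - now).total_seconds()))
```

With Python 3's urllib, the raw value can be read from a caught `urllib.error.HTTPError` via `err.headers.get("Retry-After")` and passed to this helper. Whether Google actually sends the header with its 503 is not something this thread confirms, so the `None` case needs handling.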

Unfortunately, Google's TOS prohibits automated queries. It has almost certainly detected that you are "up to no good".

Source: https://support.google.com/websearch/answer/86640?hl=en
