The URL is: "https://planningapi.agileapplications.co.uk/api/application/search?reference=GDO+19%2F12"
I can easily download this page with the Python requests library:
import requests

# Custom headers the API expects, sent in lowercase
headers = {
    'x-client': 'EXMOOR',
    'x-product': 'CITIZENPORTAL',
    'x-service': 'PA',
}
url = 'https://planningapi.agileapplications.co.uk/api/application/search?reference=GDO+19%2F12'
resp = requests.get(url, headers=headers)
Or I can just as easily download the page with cURL:
curl 'https://planningapi.agileapplications.co.uk/api/application/search?reference=GDO+19%2F12' -H 'x-product: CITIZENPORTAL' -H 'x-service: PA' -H 'x-client: EXMOOR'
Both return a status 200 result:
{"total":1,"results":[{"id":18468,"reference":"GDO 19/12","proposal":"Prior notification for excavations to bury tanks and trenches to lay water pipes","location":"Land North West of North and South Ley, Exford, Minehead, Somerset.","username":"","applicantSurname":"Mr & Mrs M Burnett","agentName":"JCH Planning Limited","decisionText":null,"registrationDate":"2019-10-04","decisionDate":"2019-10-30","finalGrantDate":null,"appealLodgedDate":null,"appealDecisionDate":null,"areaId":[],"wardId":[],"parishId":[3],"responded":null,"lastLetterDate":null,"targetResponseDate":null}]}
But Scrapy returns a status 500 error:
# Inside a spider callback
formdata = {'reference': 'GDO 19/12'}
headers = {
    'x-client': 'EXMOOR',
    'x-product': 'CITIZENPORTAL',
    'x-service': 'PA',
}
# With method='GET', FormRequest appends formdata to the URL as a query string
fr = scrapy.FormRequest(
    url='https://planningapi.agileapplications.co.uk/api/application/search',
    method='GET',
    meta=response.meta,
    headers=headers,
    formdata=formdata,
    dont_filter=True,
    callback=self.ref_result_2,
)
yield fr
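It may be because Scrapy capitalizes the header keys (I tried un-capitalizing them, but Twisted does the same thing and capitalizes them again), or it may be for some other reason.
How can I adjust my Scrapy 1.8.0 code to get the same result as Python requests?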
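This really is because Scrapy capitalizes the header fields. If you capitalize them in the cURL command, you get the same error as with Scrapy (you can test this in Scrapy by setting handle_httpstatus_list in the spider class and printing response.text in the parse method). As you already said, Twisted does the same, so overriding scrapy.http.Headers is not a solution. However, you can follow the instructions in this issue comment and use a trick that stops Twisted from capitalizing specific headers.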
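The linked comment is not reproduced here. One common form of this trick, sketched below under stated assumptions, registers the lowercase spellings in the private `_caseMappings` table on Twisted's `Headers` class. That attribute is an internal detail of the Twisted releases contemporary with Scrapy 1.8.0 and may be absent or moved in other versions, so the sketch checks for it instead of assuming it. The helper names `preserve_case_mappings` and `patch_twisted_header_case` are hypothetical.

```python
def preserve_case_mappings(names):
    """Map each lowercased header name (bytes) to the exact spelling to send."""
    return {name.lower().encode("ascii"): name.encode("ascii") for name in names}


def patch_twisted_header_case(names):
    """Ask Twisted to keep the given header names exactly as written.

    Returns True if the private table was found and updated, else False.
    """
    try:
        from twisted.web.http_headers import Headers as TwistedHeaders
    except ImportError:
        # Twisted is not installed, so there is nothing to patch.
        return False
    # Assumption: `_caseMappings` is a private class-level dict that Twisted
    # consults before falling back to its default dash-capitalization.
    table = getattr(TwistedHeaders, "_caseMappings", None)
    if table is None:
        return False
    table.update(preserve_case_mappings(names))
    return True


# Run once before any request is sent, e.g. at the top of the spider module.
patched = patch_twisted_header_case(["x-client", "x-product", "x-service"])
```

If `patched` is False, the installed Twisted does not expose the table in this form and the headers will still be capitalized on the wire.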
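With that in place, you will get the result. On the other hand, according to Section 3.2 of RFC 7230, header field names should be case-insensitive.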