Scrapy 1.8.0 returns error 500, but the same request with Python requests returns 200

import requests

headers = {
    'x-client': 'EXMOOR',
    'x-product': 'CITIZENPORTAL',
    'x-service': 'PA',
}
url = 'https://planningapi.agileapplications.co.uk/api/application/search?reference=GDO+19%2F12'
resp = requests.get(url, headers=headers)

{"total":1,"results":[{"id":18468,"reference":"GDO 19/12","proposal":"Prior notification for excavations to bury tanks and trenches to lay water pipes","location":"Land North West of North and South Ley, Exford, Minehead, Somerset.","username":"","applicantSurname":"Mr & Mrs M Burnett","agentName":"JCH Planning Limited","decisionText":null,"registrationDate":"2019-10-04","decisionDate":"2019-10-30","finalGrantDate":null,"appealLodgedDate":null,"appealDecisionDate":null,"areaId":[],"wardId":[],"parishId":[3],"responded":null,"lastLetterDate":null,"targetResponseDate":null}]}

formdata = {'reference': 'GDO 19/12'}
headers = {
    'x-client': 'EXMOOR',
    'x-product': 'CITIZENPORTAL',
    'x-service': 'PA',
}
fr = scrapy.FormRequest(
    url='https://planningapi.agileapplications.co.uk/api/application/search',
    method='GET',
    meta=response.meta,
    headers=headers,
    formdata=formdata,
    dont_filter=True,
    callback=self.ref_result_2,
)
yield fr

1 Answer

User

Answer 1 · Posted 2024-09-30 22:15:28

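This happens because Scrapy capitalizes the header field names. If you capitalize them in a cURL command, you get the same error as with Scrapy (to see the error body from Scrapy, add 500 to handle_httpstatus_list on the spider class and print response.text in the parse method). As you already said, Twisted does the same thing, so overriding scrapy.http.Headers is not a solution.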
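To make the mechanism concrete, here is a minimal standard-library sketch. It assumes the capitalization amounts to title-casing each hyphen-separated part of the name, and strict_server_lookup is a hypothetical stand-in for a server that matches header names case-sensitively; it is not the real API's code.

```python
def capitalize_header(name):
    """Title-case each hyphen-separated part, e.g. 'x-client' -> 'X-Client'."""
    return "-".join(part.capitalize() for part in name.split("-"))


def strict_server_lookup(received_headers, expected_name):
    """Hypothetical server-side lookup that is case-sensitive (assumption)."""
    return received_headers.get(expected_name)


# Headers as written in the spider.
sent = {"x-client": "EXMOOR", "x-product": "CITIZENPORTAL", "x-service": "PA"}

# Headers as they would appear on the wire after capitalization.
on_the_wire = {capitalize_header(k): v for k, v in sent.items()}

print(on_the_wire)
# The strict lookup no longer finds the lowercase name it expects.
print(strict_server_lookup(on_the_wire, "x-client"))
```

Under these assumptions the server sees X-Client instead of x-client and treats the header as missing, which would be consistent with the 500 response.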

However, you can use the trick described in this issue comment to stop Twisted from capitalizing specific headers:

# -*- coding: utf-8 -*-
from pprint import pprint
import scrapy
from twisted.web.http_headers import Headers as TwistedHeaders

TwistedHeaders._caseMappings.update({
    b'x-client': b'x-client',
    b'x-product': b'x-product',
    b'x-service': b'x-service',
})

class Foo(scrapy.Spider):
    name = 'foo'
    handle_httpstatus_list = [500]

    def start_requests(self):
        formdata = {'reference': 'GDO 19/12'}
        headers = {
            'x-client': 'EXMOOR',
            'x-product': 'CITIZENPORTAL',
            'x-service': 'PA'
        }
        yield scrapy.FormRequest(
            'https://planningapi.agileapplications.co.uk/api/application/search',
            method='GET', headers=headers, formdata=formdata, callback=self.parse)

    def parse(self, response):
        pprint(response.text)

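Now you will get the result. Note, however, that according to RFC 7230, section 3.2, header field names are case-insensitive, so the server should accept either form.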
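For comparison, a lookup that follows the RFC's case-insensitivity rule could look like the sketch below. The get_header helper is hypothetical and only illustrates the idea; it says nothing about how the planning API is actually implemented.

```python
def get_header(headers, name):
    """Return the value for a header name, ignoring the case of the name."""
    wanted = name.lower()
    for key, value in headers.items():
        if key.lower() == wanted:
            return value
    return None


# Both spellings resolve to the same value.
print(get_header({"X-Client": "EXMOOR"}, "x-client"))
print(get_header({"x-client": "EXMOOR"}, "X-CLIENT"))
```

A server written this way would answer the capitalized Scrapy request and the lowercase requests call identically.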
