Scrapy returns HTTP 405 in Python: how do I fix it?

Posted 2024-05-08 21:16:09


When I run my spider, I get the following error:

[scrapy.extensions.logstats] INFO: Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min)
2018-12-30 01:18:36 [scrapy.extensions.telnet] DEBUG: Telnet console listening on 127.0.0.1:6023
2018-12-30 01:18:37 [scrapy.core.engine] DEBUG: Crawled (405) <GET https://www.propertyguru.com.sg/robots.txt> (referer: None)
2018-12-30 01:18:37 [scrapy.core.engine] DEBUG: Crawled (405) <GET https://www.propertyguru.com.sg/> (referer: None)
2018-12-30 01:18:38 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <405 https://www.propertyguru.com.sg/>: HTTP status code is not handled or not allowed


1 answer
Answer 1 · posted 2024-05-08 21:16:09

You need to include a User-Agent and cookies in the request:

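# Add this method to your spider class (the module needs `import scrapy`).
def start_requests(self):
    # Replace both placeholders with the values copied from your browser.
    headers = {'User-Agent': 'your user agent'}
    cookies = {'cookie-key': 'cookie-value'}
    yield scrapy.Request(
        url='https://www.propertyguru.com.sg/',
        method='GET',
        headers=headers,
        cookies=cookies,
        callback=self.parse,
        errback=self.handle_err,  # handle_err must be defined on the spider
    )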
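
As an alternative sketch, the same User-Agent could be set once for the whole project in settings.py instead of on each request. USER_AGENT and DEFAULT_REQUEST_HEADERS are standard Scrapy settings, but the values below are placeholders that do not come from the original answer, and whether this alone is enough to avoid the 405 on this site is an assumption:

```python
# settings.py (fragment): project-wide defaults applied to every request.

# Placeholder: paste the string your browser reports for navigator.userAgent.
USER_AGENT = 'your user agent'

# Extra headers sent with every request unless a request overrides them.
DEFAULT_REQUEST_HEADERS = {
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
    'Accept-Language': 'en',
}
```

Headers passed directly to scrapy.Request, as in the method above, still take precedence over these defaults for that request.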

To get your User-Agent and cookies, open the Google Chrome developer console and type:

navigator.userAgent for the user agent

document.cookie for the cookies
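
document.cookie prints all cookies as one string of the form "name1=value1; name2=value2", while the cookies argument in the request above is a dict. A minimal sketch of the conversion follows; the helper name cookie_string_to_dict and the sample cookie names are hypothetical, not part of Scrapy or the original answer. Note that cookies marked HttpOnly do not appear in document.cookie, so the string may be incomplete.

```python
def cookie_string_to_dict(cookie_string):
    """Convert 'name1=value1; name2=value2' into {'name1': 'value1', ...}."""
    cookies = {}
    for pair in cookie_string.split(';'):
        # Split on the first '=' only, so values containing '=' stay intact.
        name, sep, value = pair.strip().partition('=')
        name = name.strip()
        if sep and name:
            cookies[name] = value
    return cookies


# Hypothetical sample string; replace it with the output of document.cookie.
cookies = cookie_string_to_dict('session_id=abc123; lang=en')
```

The resulting dict can be passed as cookies=cookies when building the scrapy.Request.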
