Scrapy spider can't redefine custom_settings from args

Posted 2024-05-19 16:11:37


I'm trying to override the default Scrapy settings from a spider argument, but it isn't working.

In my settings.py I have:

EXTENSIONS = {
    'scrapy.extensions.closespider.CloseSpider': 80,
}
CLOSESPIDER_PAGECOUNT = 1

Then I start the spider like this: scrapy crawl links_extractor3 -a flow=many

Nothing happens.

from scrapy.linkextractors import LinkExtractor
from scrapy.spiders import CrawlSpider, Rule


class LinksExtractor3Spider(CrawlSpider):
    name = 'links_extractor3'
    allowed_domains = ['books.toscrape.com']
    start_urls = ['http://books.toscrape.com/']
    custom_settings = {}

    def __init__(self, *args, **kwargs):
        self.flow = kwargs.get('flow')
        if self.flow == 'many':
            # Intended to lift the one-page limit, but has no effect
            self.custom_settings['CLOSESPIDER_PAGECOUNT'] = 0
        super(LinksExtractor3Spider, self).__init__(*args, **kwargs)

    rules = (
        Rule(LinkExtractor(), callback='parse_item', follow=False),
    )

    def parse_item(self, response):
        item = {}
        item['title'] = response.xpath('//head/title/text()').extract()
        item['url'] = response.url
        yield item
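
From what I can tell from the Scrapy source, this happens because custom_settings is read from the spider class before the spider is ever instantiated: the Spider base class applies it via a classmethod, roughly like this (simplified sketch of Scrapy's internals, not my code):

@classmethod
def update_settings(cls, settings):
    # Runs while the crawler is being built, before __init__ is called,
    # so it only sees the class attribute -- anything assigned to
    # self.custom_settings inside __init__ comes too late
    settings.setdict(cls.custom_settings or {}, priority='spider')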

If I hardcode custom_settings = {'CLOSESPIDER_PAGECOUNT': 0},

or pass the setting on the command line with scrapy crawl links_extractor3 -s CLOSESPIDER_PAGECOUNT=0, everything works perfectly.
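
For clarity, the hardcoded variant that works is the plain class attribute:

class LinksExtractor3Spider(CrawlSpider):
    name = 'links_extractor3'
    # Evaluated at class-definition time, so Scrapy already sees it
    # when the crawler is built
    custom_settings = {'CLOSESPIDER_PAGECOUNT': 0}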

Is it possible to override settings via spider args?
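
The only workaround I can think of is to run the spider from a script instead of scrapy crawl, so the settings can be adjusted from the arguments before the crawler is built. A rough sketch (the spider's import path below is hypothetical; adjust it to the real module):

import sys

from scrapy.crawler import CrawlerProcess
from scrapy.utils.project import get_project_settings

# Hypothetical module path -- replace with the project's actual one
from myproject.spiders.links_extractor3 import LinksExtractor3Spider

flow = sys.argv[1] if len(sys.argv) > 1 else ''
settings = get_project_settings()
if flow == 'many':
    # Decide the setting here, while it can still take effect
    settings.set('CLOSESPIDER_PAGECOUNT', 0)

process = CrawlerProcess(settings)
process.crawl(LinksExtractor3Spider, flow=flow)
process.start()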


Tags: self, com, response, args, links, item, flow, books