Stopping scrapy runspider before it has run to completion

import scrapy

class PythonEventsSpider(scrapy.Spider):
    name = 'goodspider'
    start_urls = ['https://www.amazon.com/s?me=A33IZBYF4IBZTP&marketplaceID=ATVPDKIKX0DER']
    details = []

    def parse(self, response):
        base_url = "https://www.amazon.com"
        # code here
        # Build the absolute URL of the "next page" link
        next_page = base_url + response.xpath('//li[@class="a-last"]/a/@href').extract_first()
        print(next_page)
        if "page=3" not in next_page:
            yield scrapy.Request(url=next_page, callback=self.parse)
        else:
            # Attempts so far to stop the spider at this point:
            # raise CloseSpider('bandwidth_exceeded')
            # exit("Done")
            pass

2 answers

User

Answer 1 · edited 2024-10-05 14:23:43

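If you really want the script to stop dead at that point, you can terminate it the same way you would terminate any other Python script.

However, this means that Scrapy's item processing and the rest of its internal machinery will not get a chance to run. If that is a problem for you, there is no way around it other than Umair's answer.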

User

Answer 2 · edited 2024-10-05 14:23:43

CloseSpider will also process all pending requests.

So you have to set CONCURRENT_REQUESTS=1.
