擅长:python、mysql、java
<p>如果需要使用规则来执行此操作,则可以通过提供<code>process_request</code>回调来修改生成的请求。以下是总结:</p>
<pre><code>class IcrawlerSpider(CrawlSpider):
def __init__(self, *args, **kwargs):
# ...
IcrawlerSpider.rules = [
Rule(LinkExtractor(unique=True), callback='parse_item', process_request='add_meta'),
]
def add_meta(self, request):
request.meta['handle_httpstatus_all'] = True
return request
</code></pre>
<p>引用<a href="https://docs.scrapy.org/en/latest/topics/spiders.html#crawling-rules" rel="nofollow noreferrer">documentation</a>和<a href="https://github.com/scrapy/scrapy/issues/2820#issuecomment-313986161" rel="nofollow noreferrer">example</a>。在</p>