<p>Your spider works fine for me with Scrapy 1.1 and Splash 2.1, with no changes to the code in your question, using only the settings suggested in <a href="https://github.com/scrapy-plugins/scrapy-splash" rel="nofollow">https://github.com/scrapy-plugins/scrapy-splash</a>.</p>
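<p>For reference, the settings recommended in the scrapy-splash README look roughly like the sketch below. The <code>SPLASH_URL</code> value is an assumption (a Splash instance listening locally on port 8050); adjust it to wherever your Splash instance is actually running.</p>

```python
# settings.py (sketch based on the scrapy-splash README)

# Assumption: Splash is reachable locally on port 8050; change as needed.
SPLASH_URL = 'http://localhost:8050'

# Enable the Splash downloader middlewares and reorder HTTP compression
# so it runs after them.
DOWNLOADER_MIDDLEWARES = {
    'scrapy_splash.SplashCookiesMiddleware': 723,
    'scrapy_splash.SplashMiddleware': 725,
    'scrapy.downloadermiddlewares.httpcompression.HttpCompressionMiddleware': 810,
}

# Deduplicate repeated Splash arguments to save space in the request queue.
SPIDER_MIDDLEWARES = {
    'scrapy_splash.SplashDeduplicateArgsMiddleware': 100,
}

# Make the duplicate filter and HTTP cache aware of Splash arguments.
DUPEFILTER_CLASS = 'scrapy_splash.SplashAwareDupeFilter'
HTTPCACHE_STORAGE = 'scrapy_splash.SplashAwareFSCacheStorage'
```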
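<p>As others have mentioned, your <code>parse</code> function can be simplified by calling <code>response.css()</code> and <code>response.xpath()</code> directly, without rebuilding a <code>Selector</code> from the response.</p>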
<p>I tried with:</p>
<pre><code>import scrapy
from scrapy_splash import SplashRequest


class CartierSpider(scrapy.Spider):
    name = 'cartier'
    start_urls = ['http://www.cartier.co.uk/en-gb/collections/watches/mens-watches/ballon-bleu-de-cartier/w69017z4-ballon-bleu-de-cartier-watch.html']

    def start_requests(self):
        # Render each page through Splash, waiting 0.5 s for JavaScript to run
        for url in self.start_urls:
            yield SplashRequest(url, self.parse, args={'wait': 0.5})

    def parse(self, response):
        # Query the response directly; no separate Selector is needed
        yield {
            'title': response.xpath('//title/text()').extract_first(),
            'link': response.url,
            'productID': response.xpath('//span[@itemprop="productID"]/text()').extract_first(),
            'model': response.xpath('//span[@itemprop="model"]/text()').extract_first(),
            'price': response.css('div.price-wrapper').xpath('.//span[@itemprop="price"]/text()').extract_first(),
        }
</code></pre>
<p>and got this:</p>
^{pr2}$