How to pass a single link in a nested URL scrape?



I am having trouble scraping subpages whose links I collect from a main page. Every comic has its own page, so I try to open each item's page and scrape the price there. This is the spider:

<pre class="lang-py prettyprint-override"><code>import scrapy
from scrapy.loader import ItemLoader

# ComicscraperItem is defined in the project's items module.


class PaniniSpider(scrapy.Spider):
    name = "spiderP"
    start_urls = ["http://comics.panini.it/store/pub_ita_it/magazines.html"]

    def parse(self, response):
        # Get all the anchor tags in the listing
        for sel in response.xpath("//div[@class='list-group']//h3/a"):
            l = ItemLoader(item=ComicscraperItem(), selector=sel)
            l.add_xpath('title', './text()')
            l.add_xpath('link', './@href')

            request = scrapy.Request(sel.xpath('./@href').extract_first(),
                                     callback=self.parse_isbn,
                                     dont_filter=True)
            request.meta['l'] = l
            yield request

    def parse_isbn(self, response):
        l = response.meta['l']
        l.add_xpath('price', "//p[@class='special-price']//span/text()")
        return l.load_item()
</code></pre>

The problem is with the link. The output looks something like this:

<pre class="lang-sh prettyprint-override"><code>{"title": "Spider-Man 14", "link": ["http://comics.panini.it/store/pub_ita_it/mmmsm014isbn-it-marvel-masterworks-spider-man-marvel-masterworks-spider.html"], "price": ["\n \u20ac\u00a022,50 ", "\n \u20ac\u00a076,50 ", "\n \u20ac\u00a022,50 ", "\n \u20ac\u00a022,50 ", "\n \u20ac\u00a022,50 ", "\n \u20ac\u00a018,00
{"title": "Avenger di John Byrne", "link": ["http://comics.panini.it/store/pub_ita_it/momae005isbn-it-omnibus-avengers-epic-collecti-marvel-omnibus-avengers-by.html"], "price": ["\n \u20ac\u00a022,50 ", "\n \u20ac\u00a076,50 ", "\n \u20ac\u00a022,50
</code></pre>

In short, the request seems to pass the list of links for every item, so the price is not unique but a list of results. How can I pass only the link of the relevant item and store the price for each item?

Category: Python Q&A


1 answer

Anonymous, 1 day ago



