I need to build a crawler that scrapes https://www.karton.eu/einwellig-ab-100-mm and, after following each product link to its own page, also scrapes the product's weight.
I have already checked that the URL is not broken, so I can fetch it in my Scrapy shell.
The code used is:
import scrapy
from ..items import KartonageItem

class KartonSpider(scrapy.Spider):
    name = "kartons"
    allow_domains = ['karton.eu']
    start_urls = [
        'https://www.karton.eu/einwellig-ab-100-mm'
    ]
    custom_settings = {'FEED_EXPORT_FIELDS': ['SKU', 'Title', 'Link', 'Price', 'Delivery_Status', 'Weight']}

    def parse(self, response):
        card = response.xpath('//div[@class="text-center artikelbox"]')
        for a in card:
            items = KartonageItem()
            link = a.xpath('@href')
            items['SKU'] = a.xpath('.//div[@class="signal_image status-2"]/small/text()').get()
            items['Title'] = a.xpath('.//div[@class="title"]/a/text()').get()
            items['Link'] = link.get()
            items['Price'] = a.xpath('.//div[@class="price_wrapper"]/strong/span/text()').get()
            items['Delivery_Status'] = a.xpath('.//div[@class="signal_image status-2"]/small/text()').get()
            yield response.follow(url=link.get(), callback=self.parse, meta={'items': items})

    def parse_item(self, response):
        table = response.xpath('//span[@class="staffelpreise-small"]')
        items = KartonageItem()
        items = response.meta['items']
        items['Weight'] = response.xpath('//span[@class="staffelpreise-small"]/text()').get()
        yield items
What is causing this error?
The problem is that link.get() returns a None value. The issue seems to be in the XPath: although the card variable does select some div tags, there is no @href on the div's self axis (which is why it returns empty); the attribute lives on the child a tag instead. So I believe selecting the href from the descendant a tag will give you the expected result.
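The self-axis-versus-descendant distinction can be demonstrated with a minimal, self-contained sketch. It uses the standard-library ElementTree as a stand-in for Scrapy's selectors, and the HTML snippet and the /product-1 URL are made up for illustration; in the spider itself, the equivalent fix would be to query `a.xpath('.//a/@href')` instead of `a.xpath('@href')`.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified markup mirroring one "artikelbox" card.
html = (
    '<div class="text-center artikelbox">'
    '<div class="title"><a href="/product-1">Example box</a></div>'
    '</div>'
)
card = ET.fromstring(html)

# Asking the div itself for href: the attribute is not there,
# which is exactly why link.get() yields None in the spider.
print(card.get('href'))               # None

# Asking the descendant <a> tag instead: the attribute is found.
print(card.find('.//a').get('href'))  # /product-1
```

Note that with the corrected link, the follow call should also point at the detail-page callback (`callback=self.parse_item`) so that `parse_item` actually runs and the Weight field gets filled.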