Crawling with Scrapy is unsuccessful

Posted 2024-09-30 22:26:51


I'm trying to scrape all product names from the site https://www.kalkhoff-bikes.com/ with Scrapy, but the results are not as expected. What am I doing wrong? My first attempt:

import scrapy

class ToScrapeSpider(scrapy.Spider):
    name = 'Kalkhoff_1'

    start_urls = [
        'https://www.kalkhoff-bikes.com/'
    ]
    allowed_domains = [
        'kalkhoff-bikes.com'
    ]

    def parse(self, response):
        for item in response.css('ul.navMain__subList--sub > li.navMain__subItem'):
            yield {
                'Name': item.css("span.navMain__subText::text").get(),
            }

        for href in response.css('li.navMain__item a::attr(href)'):
            yield response.follow(href, self.parse)

After that, I read that if the content is loaded dynamically, the solution is Splash. So I tried this:

import scrapy
from scrapy_splash import SplashRequest

class ToScrapeSpider(scrapy.Spider):
    name = 'Kalkhoff_2'

    start_urls = [
        'https://www.kalkhoff-bikes.com/'
    ]
    allowed_domains = [
        'kalkhoff-bikes.com'
    ]

    def start_requests(self):
        for url in self.start_urls:
            yield SplashRequest(url, self.parse,
                endpoint='render.html',
                args={'wait':0.5},
            )

    def parse(self, response):
        for item in response.css('ul.navMain__subList--sub > li.navMain__subItem'):
            yield {
                'Name': item.css("span.navMain__subText::text").get(),
            }

        for href in response.css('li.navMain__item a::attr(href)'):
            yield response.follow(href, self.parse)
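Note that `SplashRequest` only works if scrapy-splash is wired into the project's settings.py and a Splash instance is actually running (e.g. started with `docker run -p 8050:8050 scrapinghub/splash`). A minimal settings sketch following the scrapy-splash README, assuming Splash listens on localhost:8050:

```python
# settings.py — minimal scrapy-splash wiring
# (assumes a Splash instance is reachable at localhost:8050)
SPLASH_URL = 'http://localhost:8050'

DOWNLOADER_MIDDLEWARES = {
    'scrapy_splash.SplashCookiesMiddleware': 723,
    'scrapy_splash.SplashMiddleware': 725,
    'scrapy.downloadermiddlewares.httpcompression.HttpCompressionMiddleware': 810,
}

SPIDER_MIDDLEWARES = {
    'scrapy_splash.SplashDeduplicateArgsMiddleware': 100,
}

# Make request fingerprinting aware of Splash arguments.
DUPEFILTER_CLASS = 'scrapy_splash.SplashAwareDupeFilter'
```

Without these settings the request is sent as a normal Scrapy request and the response is the same unrendered HTML as in the first attempt.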

Unfortunately, I still don't get all the product names. Am I on the right track?
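Independent of the rendering question, the recursive `parse` above re-yields the same menu entries on every page it visits, so the output will contain many duplicates. A small, hypothetical helper that the spider could use to yield each name only once; the names below are made-up examples:

```python
class SeenNames:
    """Tracks names already yielded so a spider emits each one once."""

    def __init__(self):
        self._seen = set()

    def is_new(self, name):
        # Normalize whitespace so 'Endeavour 5.B ' and 'Endeavour 5.B'
        # count as the same name; reject empty/None values outright.
        key = (name or '').strip()
        if not key or key in self._seen:
            return False
        self._seen.add(key)
        return True


# In parse(), one would filter with:  if self.seen.is_new(name): yield {'Name': name}
seen = SeenNames()
names = ['Endeavour 5.B', 'Endeavour 5.B ', 'Image 3.B', None]
unique = [n.strip() for n in names if n and seen.is_new(n)]
# → ['Endeavour 5.B', 'Image 3.B']
```

This removes repeats but not non-product entries; since the selector matches every sub-menu item, service or info links may still need to be filtered out separately.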

