Response.css returns an empty list when selecting div data with the Python Scrapy library?

Posted 2024-06-26 00:05:49


scrapy shell 'https://www.blibli.com/promosi/samsung-mobilephones-tablet?appsWebview=true'  
fetch('https://www.blibli.com/promosi/samsung-mobilephones-tablet?appsWebview=true')
response.css('div.productset-carousel-mobile__block-item item')
[]

Description: I'm trying to fetch the name and price of the products listed at the URL. To get the raw data of the div with class = 'productset-carousel-mobile__block-item item', I run response.css('div.productset-carousel-mobile__block-item item'), but every time it returns an empty list or just drops to the next line of the terminal.

I don't know where I'm going wrong. I'm currently learning Scrapy from a YouTube tutorial.

Any suggestions, and links that would help me understand this concept, are welcome.


1 answer
User
#1 · Posted 2024-06-26 00:05:49

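The site's content is dynamic: it is loaded by JavaScript after the page is fetched, so it is not in the HTML that Scrapy receives and your selector has nothing to match. However, there is an API that serves the same content you are after. Here is how to scrape the product names and the categories they belong to from the landing page. Feel free to add other relevant fields.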

import scrapy


class BliBliSpider(scrapy.Spider):
    name = 'blibli'
    # JSON endpoint that serves the promotion page's content
    start_urls = ['https://www.blibli.com/backend/content/promotions/samsung-mobilephones-tablet']

    def parse(self, response):
        # response.json() requires Scrapy 2.2 or later
        for item in response.json()['data']['components']:
            # Skip every component that is not a product carousel
            if item['name'] != 'PRODUCT_CAROUSEL':
                continue
            for container in item['parameters']:
                cat_name = container['title']
                for product in container['products']:
                    yield {"category": cat_name, "product name": product['name']}
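The traversal in the spider can also be tried without touching the network. The following is a minimal sketch, assuming the response has the shape the spider above relies on (data → components → parameters → products). The helper name iter_carousel_products and the sample payload are hypothetical and made up for illustration; they are not real data from the site. Unlike the spider, this version uses .get() with defaults, so a component with a missing key is skipped rather than raising KeyError.

```python
from typing import Any, Dict, Iterator


def iter_carousel_products(payload: Dict[str, Any]) -> Iterator[Dict[str, str]]:
    """Yield one {"category", "product name"} dict per carousel product."""
    # .get() with defaults: a missing key yields nothing instead of raising KeyError.
    for component in payload.get("data", {}).get("components", []):
        if component.get("name") != "PRODUCT_CAROUSEL":
            continue
        for container in component.get("parameters", []):
            for product in container.get("products", []):
                yield {
                    "category": container.get("title", ""),
                    "product name": product.get("name", ""),
                }


# Hypothetical payload for illustration only; not real data from the site.
sample_payload = {
    "data": {
        "components": [
            {"name": "BANNER", "parameters": []},
            {
                "name": "PRODUCT_CAROUSEL",
                "parameters": [
                    {
                        "title": "Phones",
                        "products": [{"name": "Phone A"}, {"name": "Phone B"}],
                    },
                    {"title": "Tablets"},  # no "products" key, so it is skipped
                ],
            },
        ]
    }
}

rows = list(iter_carousel_products(sample_payload))
print(rows)
```

Inside the spider, parse could then be reduced to `yield from iter_carousel_products(response.json())`, which keeps the extraction logic testable separately from the crawl.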
