Modify extracted data before saving?

Posted 2024-10-03 11:24:25

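I'm trying to prepend a URL to a piece of extracted data, but for the life of me I can't figure out how to do it.

The selector I'm using looks like this: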

'15_urlmod': response.url.split('=')[-1] + "_l_a1.jpg"

This line returns something like:

12306116_l_a1.jpg

I then want to put http:exampleurl.com/images/ in front of it.

So the final URL that Scrapy extracts and saves would be:

http:exampleurl.com/images/12306116_l_a1.jpg

I'm new to Python and have spent days searching for a way to do this. The full spider code I'm using is below:

import scrapy


class examplespiderscraper(scrapy.Spider):
    name = "examplespider"
    # Starting URL to scrape
    start_urls = ['https://www.exampleurl.com']

    def parse(self, response):
        # Follow each product link to its detail page
        for book_url in response.xpath(
                "//div[@class='s-producttext-top-wrapper']/a//@href").extract():
            yield scrapy.Request(response.urljoin(book_url), callback=self.parse_details)
        # Follow the pagination link, if there is a next page
        next_page = response.css('span.PageNumberInner > a.swipeNextClick::attr(href)').extract_first()
        if next_page:
            yield scrapy.Request(response.urljoin(next_page), callback=self.parse)

    def parse_details(self, response):
        yield {
            '01_brand': response.xpath("//span[@id='lblProductBrand']/text()").extract_first(),
            # Text after the last '=' in the page URL, plus the image suffix
            '15_urlmod': response.url.split('=')[-1] + "_l_a1.jpg",
        }

1 Answer

#1 · Posted 2024-10-03 11:24:25

Solved it. After playing around for a while, I came up with:

"https://sampleurl" + response.url.split('=')[-1] + "_l_a8.jpg"

I didn't know you could do that.

Hope this helps someone.
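
For anyone who wants to try the idea outside a spider, here is a minimal sketch of the same string concatenation as a plain function. The base URL with its `//`, the `?id=` page URL in the usage line, and the names `IMAGE_BASE` and `build_image_url` are assumptions made up for illustration; none of them come from the original post.

```python
# Hypothetical base URL for the image folder (an assumption, not from the post).
IMAGE_BASE = "http://exampleurl.com/images/"


def build_image_url(page_url, base=IMAGE_BASE, suffix="_l_a1.jpg"):
    """Build an image URL from the text after the last '=' in page_url."""
    # Same expression as in the spider: take whatever follows the last '='.
    product_id = page_url.split("=")[-1]
    # Plain concatenation: base + id + suffix.
    return base + product_id + suffix


# Usage with a made-up product page URL:
print(build_image_url("https://www.exampleurl.com/product?id=12306116"))
```

Inside `parse_details` the item value would then presumably be `build_image_url(response.url)`. Note that `split('=')[-1]` returns the whole URL when it contains no `=`, so this only works for pages whose URL ends in the ID parameter.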
