How do I handle both of these pages in one Scrapy function?

Posted 2024-10-05 17:34:56

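Whenever I put both the captions and the transcription links in the start_urls variable, I get the caption price in both the Caption_price and Transcription_price variables, and then the transcription price in both variables again. Why does this happen, and how can I fix it?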

import scrapy
from ..items import FetchingItem

class SiteFetching(scrapy.Spider):
    name = 'Site'
    start_urls = ['https://www.rev.com/freelancers/captions',
                  'https://www.rev.com/freelancers/transcription']

    def parse(self, response):
        items = FetchingItem()
        Transcription_price = response.css('#middle-benefit .mt1::text').extract()
        Caption_price = response.css('#middle-benefit .mt1::text').extract()

        items['Transcription_price'] = Transcription_price
        items['Caption_price'] = Caption_price
        yield items

1 answer
User
#1 · Posted 2024-10-05 17:34:56

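I suspect you need a different class structure, one that requests the two pages in sequence. parse runs once for every URL in start_urls, and both fields use the same selector, so each page fills both fields with its own price: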

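import scrapy
from scrapy import Request  # needed for the follow-up request below
from ..items import FetchingItem

class SiteFetching(scrapy.Spider):
    name = 'Site'
    # Start with the captions page only; the transcription page is requested from parse().
    start_urls = ['https://www.rev.com/freelancers/captions']

    def parse(self, response):
        items = FetchingItem()
        items['Caption_price'] = response.css('#middle-benefit .mt1::text').extract()
        # Pass the partially filled item on to the next callback via meta.
        yield Request('https://www.rev.com/freelancers/transcription',
                      callback=self.parse_transcription, meta={'items': items})

    def parse_transcription(self, response):
        items = response.meta['items']
        items['Transcription_price'] = response.css('#middle-benefit .mt1::text').extract()
        # Both fields are set now, so yield the single combined item.
        yield items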
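
If you would rather keep both links in start_urls, another option might be to choose the field from the URL of the page being parsed. The following is only a minimal sketch of that idea: field_for_url and FIELD_BY_PAGE are hypothetical names that are not part of the original code, and this approach would yield one item per page instead of a single combined item.

```python
from urllib.parse import urlparse

# Hypothetical mapping from the last URL path segment to the item field
# (field names are taken from the question's FetchingItem).
FIELD_BY_PAGE = {
    "captions": "Caption_price",
    "transcription": "Transcription_price",
}


def field_for_url(url):
    """Return the item field that matches a page URL, or None if unknown."""
    last_segment = urlparse(url).path.rstrip("/").rsplit("/", 1)[-1]
    return FIELD_BY_PAGE.get(last_segment)


# Inside the spider, parse() could then fill only the matching field:
#
#     def parse(self, response):
#         field = field_for_url(response.url)
#         if field is not None:
#             yield {field: response.css('#middle-benefit .mt1::text').extract()}
```

The chained-request version above is the one to use if both prices have to end up in the same item; the sketch here only stops each page from filling both fields.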
