Error "'NoneType' object is not iterable" when scraping a website with Scrapy

import scrapy


class QuotesSpider(scrapy.Spider):
    name = "quotes"

    def start_requests(self):
        urls = [
            'http://www.narakkalkuries.com/intimation.html#i'
        ]

    def parse(self, response):
        for check in response.xpath('//table[@class="MsoTableGrid"]'):
            yield {
                'data': check.xpath('//table[@class="MsoTableGrid"]/tr/td/p/b//text()').extract_first()
            }

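2 Answers

User

#1 · Edited 2024-09-29 21:57:15

To add to this: start_requests should be a generator of scrapy.Request objects. Your start_requests builds a list of URLs but never yields anything, so calling it returns None: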

def start_requests(self):
    urls = [
       'http://www.narakkalkuries.com/intimation.html#i'
    ]
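The failure can be reproduced without Scrapy. The sketch below is plain Python, and the names broken_start_requests, fixed_start_requests, and consume are hypothetical, chosen only for illustration. It assumes the caller simply iterates over whatever start_requests returns, which is enough to trigger the same TypeError.

```python
URL = "http://www.narakkalkuries.com/intimation.html#i"


def broken_start_requests():
    # Builds a list but never yields or returns it, so the call returns None.
    urls = [URL]


def fixed_start_requests():
    urls = [URL]
    for url in urls:
        # A real spider would yield scrapy.Request(url) here;
        # a plain string stands in for the request in this sketch.
        yield url


def consume(make_requests):
    """Iterate over the callable's return value, as a caller of start_requests would."""
    try:
        return list(make_requests())
    except TypeError as exc:
        return f"TypeError: {exc}"


broken_result = consume(broken_start_requests)
fixed_result = consume(fixed_start_requests)
print(broken_result)
print(fixed_result)
```

The first call fails because None cannot be iterated. The second works because the yield statement turns the function into a generator.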

To fix it, yield a request for each URL in your start_requests method:

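def start_requests(self):
    urls = [
       'http://www.narakkalkuries.com/intimation.html#i'
    ]
    # Yield one scrapy.Request per URL
    for url in urls:
        yield scrapy.Request(url)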

Or just set the start_urls class attribute and rely on the default start_requests method inherited from scrapy.Spider:

import scrapy
class QuotesSpider(scrapy.Spider):
    name = "quotes"
    start_urls = [
        'http://www.narakkalkuries.com/intimation.html#i'
    ]
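How start_urls and the inherited start_requests fit together can be sketched in plain Python. FakeRequest and BaseSpider below are hypothetical stand-ins written for this illustration; they are not Scrapy's classes, and they only assume that the inherited method turns each entry of start_urls into one request.

```python
from dataclasses import dataclass


@dataclass
class FakeRequest:
    # Hypothetical stand-in for scrapy.Request; it only stores the URL.
    url: str


class BaseSpider:
    # Hypothetical stand-in for scrapy.Spider.
    start_urls = []

    def start_requests(self):
        # Assumed default behaviour: one request per entry in start_urls.
        for url in self.start_urls:
            yield FakeRequest(url)


class QuotesSpider(BaseSpider):
    name = "quotes"
    # No start_requests override: the inherited generator is used.
    start_urls = ["http://www.narakkalkuries.com/intimation.html#i"]


requests = list(QuotesSpider().start_requests())
print(requests)
```

With this pattern the subclass only declares data, and the generator logic lives in one place in the base class.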

User

#2 · Edited 2024-09-29 21:57:15

import scrapy


class QuotesSpider(scrapy.Spider):
    name = "quotes"

    # Both methods must be indented inside the class body
    def start_requests(self):
        urls = [
            'http://www.narakkalkuries.com/intimation.html#i'
        ]

        # Here you need to yield the scrapy.Request
        for url in urls:
            yield scrapy.Request(url)

    def parse(self, response):
        for check in response.xpath('//table[@class="MsoTableGrid"]'):
            # Note: an XPath starting with // searches the whole document, not just `check`
            yield {
                'data': check.xpath('//table[@class="MsoTableGrid"]/tr/td/p/b//text()').extract_first()
            }
