How do I fix "TypeError: Cannot mix str and non-str arguments"?

# -*- coding: utf-8 -*-
import scrapy
from myproject.items import Headline

class NewsSpider(scrapy.Spider):
    name = 'IC'
    allowed_domains = ['kosoku.jp']
    start_urls = ['http://kosoku.jp/ic.php']

    def parse(self, response):
        """
        extract target urls and combine them with the main domain
        """
        for url in response.css('table a::attr("href")'):
            yield(scrapy.Request(response.urljoin(url), self.parse_topics))

    def parse_topics(self, response):
        """
        pick up necessary information
        """
        item=Headline()
        item["name"]=response.css("h2#page-name ::text").re(r'.*（インターチェンジ）')
        item["road"]=response.css("div.ic-basic-info-left div:last-of-type ::text").re(r'.*道$')
        yield item

2 answers

User

Answer 1 · edited 2024-10-01 17:26:37

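You get this error because of line 15 of your code, the yield inside parse. response.css('table a::attr("href")') returns a SelectorList, and iterating over it yields Selector objects rather than strings. response.urljoin() needs a str, so each url has to be converted from a Selector to a str before it is joined with the base URL and handed to the next callback. Separately, the attribute name inside ::attr() is conventionally written without quotes, a::attr(href) rather than a::attr("href"); the quoted form is not what raises this TypeError, but the unquoted form is the one used in the Scrapy documentation.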

With both points addressed, the code looks like this:

def parse(self, response):
    """
    extract target urls and combine them with the main domain
    """
    # .getall() (same as .extract()) turns the SelectorList into a list of str
    for href in response.css('table a::attr(href)').getall():
        # response.follow resolves relative hrefs against response.url
        yield response.follow(href, self.parse_topics)

User

Answer 2 · edited 2024-10-01 17:26:37

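According to the Scrapy documentation, the .css(selector) method you are using returns a SelectorList instance. If you need the actual (unicode) string version of each url, call the extract() method: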

def parse(self, response):
    for url in response.css('table a::attr("href")').extract():
        yield(scrapy.Request(response.urljoin(url), self.parse_topics))
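To see where the message comes from without running a spider, here is a minimal sketch using only the standard library. It assumes that response.urljoin() ends up calling urllib.parse.urljoin with the response URL as the base. FakeSelector and the href value are hypothetical stand-ins for illustration, not part of Scrapy or of the site being crawled.

```python
from urllib.parse import urljoin


class FakeSelector:
    """Hypothetical stand-in for a Scrapy Selector: an object that is not a str."""

    def __init__(self, value):
        self._value = value

    def get(self):
        # Plays the role of Selector.get(): return the matched text as a str.
        return self._value


base = "http://kosoku.jp/ic.php"
sel = FakeSelector("/ic/example.html")  # hypothetical href, for illustration only

# Joining a str base with a non-str object reproduces the error.
try:
    urljoin(base, sel)
    error_message = None
except TypeError as exc:
    error_message = str(exc)
print(error_message)

# Converting to str first is the fix both answers arrive at.
fixed = urljoin(base, sel.get())
print(fixed)
```

The same reasoning applies to the spider: once the loop iterates over strings (via extract() or getall()), urljoin receives two str arguments and the TypeError goes away.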
