Scrapy KeyError: 'Item does not support field: url'

Published 2024-10-01 09:39:26


User


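I'm learning to write a spider and have been trying to track down this little bug. Any help would be greatly appreciated. Thanks.

When I run my spider, I get this error: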

KeyError: 'SoapguildItem does not support field: url'

Here is the code I've been working on:

# -*- coding: utf-8 -*-
import scrapy
from scrapy.linkextractors import LinkExtractor
from scrapy.spiders import CrawlSpider, Rule

from soapguild.items import SoapguildItem

class SoapySpider(CrawlSpider):
    name = 'soapy'
    allowed_domains = ['soapguild.org']
    start_urls = ['http://www.soapguild.org/']

    rules = (
        Rule(LinkExtractor(), callback='parse_item', follow=True),
    )

    def parse_item(self, response):
        href = SoapguildItem()
        href['url'] = response.url
        # Email
        email = response.xpath("//div/div[1]/p[2]/a[1]/@href").extract()
        email = email.replace("mailto:", "")
        #email = email.replace("(at)". "@")
        location = response.xpath("//div/div[1]/p[1]/text()[2]").extract()
        #location
        location = response.xpath("//div/div[1]/p[1]/text()[2]").extract()
        #contact
        contact = response.xpath("//div/div[1]/p[2]/text()[1]").extract()
        contact = contact.replace("Contact: ", "")
        #website
        website = response.xpath("//div/div[1]/p[2]/a[2]//@href").extract()

        for item in zip(email,location,contact,website):
            scraped_info = {
                'Email' : item[0],
                'Location' : item[1],
                'Contact' : item[2],
                'Website' : item[3]
            }

            yield scraped_info


2 Answers

User

#1 · Edited 2024-10-01 09:39:26

The item class SoapguildItem does not declare a field named url. Please define url in the item class:

from scrapy.item import Item, Field


class SoapguildItem(Item):
    url = Field()
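
To illustrate why declaring the field removes the error, here is a minimal standard-library sketch of the check that produces this KeyError. It is a simplified stand-in, not Scrapy's actual source: MiniItem, EmptyItem, UrlItem and declared_fields are hypothetical names invented for this example, and only the error message format is taken from the traceback in the question.

```python
class Field(dict):
    """Stand-in for scrapy.Field (a dict of field metadata)."""


class MiniItem:
    """Simplified model of an item: only declared fields may be assigned."""

    def __init__(self):
        self._values = {}

    @classmethod
    def declared_fields(cls):
        # Collect every class attribute, including inherited ones,
        # that is a Field instance.
        return {
            name
            for klass in cls.__mro__
            for name, attr in vars(klass).items()
            if isinstance(attr, Field)
        }

    def __setitem__(self, key, value):
        # Assigning to an undeclared key is rejected with a KeyError.
        if key not in self.declared_fields():
            raise KeyError(f"{type(self).__name__} does not support field: {key}")
        self._values[key] = value

    def __getitem__(self, key):
        return self._values[key]


class EmptyItem(MiniItem):
    pass  # no fields declared, like the item class in the question


class UrlItem(MiniItem):
    url = Field()  # declaring the field makes item['url'] legal


error_message = None
try:
    EmptyItem()["url"] = "http://www.soapguild.org/"
except KeyError as exc:
    error_message = exc.args[0]

item = UrlItem()
item["url"] = "http://www.soapguild.org/"
```

In this sketch the assignment on EmptyItem fails while the same assignment on UrlItem succeeds, which matches the fix above: every key assigned in parse_item needs a matching Field() declaration on the item class.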

User

#2 · Edited 2024-10-01 09:39:26

Did you add url as a field in items.py? I think the error comes from this line: href['url']
