Python Scrapy spider not found (KeyError)

Published 2024-10-02 08:24:43


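For several days I have been trying to build a crawler in Scrapy, and in every project I hit the same error: spider not found. No matter what I change or which tutorial I follow, it always returns the same error.

Can anyone tell me where I should look for the mistake?

Thank you!

Windows 10, Python 2.7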

C:.
│   scrapy.cfg
│
└───scrapscrapy
    │   items.py
    │   middlewares.py
    │   pipelines.py
    │   settings.py
    │   settings.pyc
    │   __init__.py
    │   __init__.pyc
    │
    └───spiders
            SSSpider.py
            SSSpider.pyc

items.py

(listing not preserved)

SSSpider.py

from scrapy.selector import Selector
from scrapy.spider import Spider
from Scrapscrapy.items import ScrapscrapyItem

class ScrapscrapySpider(Spider):
    name="ss"
    allowed_domains = ["yellowpages.md/rom/companies/info/2683-intelsmdv-srl"]
    start_url = ['http://yellowpages.md/rom/companies/info/2683-intelsmdv-srl/']

    def parse(self, response) : 
        sel = Selector (response)
        item = ScrapscrapyItem()
        item['Heading']=sel.xpath('/html/body/div[2]/div[2]/div/div/div/div/div[1]/div/div[2]/div/article/div/div[1]/div[2]/h2').extract
        item['Content']=sel.xpath('/html/body/div[2]/div[2]/div/div/div/div/div[1]/div/div[2]/div/article/div/div[1]/div[2]/div[2]/div/div[2]/div/div[1]/div[1]').extract
        item['Source_Website']= 'yellowpages.md/rom/companies/info/2683-intelsmdv-srl'
        return item
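
Once the spider can be loaded, two attributes in the listing above would likely still cause trouble: Scrapy reads `start_urls` (plural), and `allowed_domains` expects bare host names rather than a host plus path. A minimal sketch of how the values could be derived, using only the standard library; the constant name `START_URL` is an assumption made for illustration:

```python
try:
    from urllib.parse import urlparse  # Python 3
except ImportError:
    from urlparse import urlparse  # Python 2.7, as used in the question

# The page the spider is meant to fetch, taken from the listing above.
START_URL = "http://yellowpages.md/rom/companies/info/2683-intelsmdv-srl/"

# Scrapy looks for the plural attribute name `start_urls`;
# an attribute called `start_url` is not used.
start_urls = [START_URL]

# `allowed_domains` takes host names only, without scheme or path.
allowed_domains = [urlparse(START_URL).hostname]

print(allowed_domains)
```

Separately, `.extract` without parentheses stores the method object itself; calling it as `.extract()` returns the extracted strings.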

Settings

BOT_NAME = 'scrapscrapy'

SPIDER_MODULES = ['scrapscrapy.spiders']
NEWSPIDER_MODULE = 'scrapscrapy.spiders'


# Crawl responsibly by identifying yourself (and your website) on the user-agent
#USER_AGENT = 'scrapscrapy (+http://www.yourdomain.com)'

# Obey robots.txt rules
ROBOTSTXT_OBEY = True

Command line:

C:\Users\nastea\Desktop\scrapscrapy>scrapy crawl ss
c:\python27\lib\site-packages\scrapy-1.3.2-py2.7.egg\scrapy\spiderloader.py:37: RuntimeWarning:
Traceback (most recent call last):
  File "c:\python27\lib\site-packages\scrapy-1.3.2-py2.7.egg\scrapy\spiderloader.py", line 31, in _load_all_spiders
    for module in walk_modules(name):
  File "c:\python27\lib\site-packages\scrapy-1.3.2-py2.7.egg\scrapy\utils\misc.py", line 63, in walk_modules
    mod = import_module(path)
  File "c:\python27\lib\importlib\__init__.py", line 37, in import_module
    __import__(name)
ImportError: No module named spiders
Could not load spiders from module 'scrapscrapy.spiders'. Check SPIDER_MODULES setting
  warnings.warn(msg, RuntimeWarning)
2017-02-19 14:21:16 [scrapy.utils.log] INFO: Scrapy 1.3.2 started (bot: scrapscrapy)
2017-02-19 14:21:16 [scrapy.utils.log] INFO: Overridden settings: {'NEWSPIDER_MODULE': 'scrapscrapy.spiders', 'SPIDER_MODULES': ['scrapscrapy.spiders'], 'ROBOTSTXT_OBEY': True, 'BOT_NAME': 'scrapscrapy'}
Traceback (most recent call last):
  File "c:\python27\Scripts\scrapy-script.py", line 11, in <module>
    load_entry_point('scrapy==1.3.2', 'console_scripts', 'scrapy')()
  File "c:\python27\lib\site-packages\scrapy-1.3.2-py2.7.egg\scrapy\cmdline.py", line 142, in execute
    _run_print_help(parser, _run_command, cmd, args, opts)
  File "c:\python27\lib\site-packages\scrapy-1.3.2-py2.7.egg\scrapy\cmdline.py", line 88, in _run_print_help
    func(*a, **kw)
  File "c:\python27\lib\site-packages\scrapy-1.3.2-py2.7.egg\scrapy\cmdline.py", line 149, in _run_command
    cmd.run(args, opts)
  File "c:\python27\lib\site-packages\scrapy-1.3.2-py2.7.egg\scrapy\commands\crawl.py", line 57, in run
    self.crawler_process.crawl(spname, **opts.spargs)
  File "c:\python27\lib\site-packages\scrapy-1.3.2-py2.7.egg\scrapy\crawler.py", line 162, in crawl
    crawler = self.create_crawler(crawler_or_spidercls)
  File "c:\python27\lib\site-packages\scrapy-1.3.2-py2.7.egg\scrapy\crawler.py", line 190, in create_crawler
    return self._create_crawler(crawler_or_spidercls)
  File "c:\python27\lib\site-packages\scrapy-1.3.2-py2.7.egg\scrapy\crawler.py", line 194, in _create_crawler
    spidercls = self.spider_loader.load(spidercls)
  File "c:\python27\lib\site-packages\scrapy-1.3.2-py2.7.egg\scrapy\spiderloader.py", line 51, in load
    raise KeyError("Spider not found: {}".format(spider_name))
KeyError: 'Spider not found: ss'
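
The first traceback ends in `ImportError: No module named spiders`, and the directory listing shows no `__init__.py` inside `spiders`. Under Python 2.7 a directory is importable as a package only when it contains that file. A minimal sketch of a check for this, using only the standard library; the helper name `missing_init_files` is hypothetical and not part of Scrapy:

```python
import os


def missing_init_files(project_root, dotted_module):
    """Return directories along dotted_module that lack an __init__.py.

    Each component of the dotted path is resolved below project_root;
    components that are not directories (plain modules) are skipped.
    """
    missing = []
    current = project_root
    for part in dotted_module.split("."):
        current = os.path.join(current, part)
        init_file = os.path.join(current, "__init__.py")
        if os.path.isdir(current) and not os.path.isfile(init_file):
            missing.append(current)
    return missing


if __name__ == "__main__":
    # Intended to be run from the directory that holds scrapy.cfg.
    for path in missing_init_files(os.getcwd(), "scrapscrapy.spiders"):
        print("missing __init__.py in: " + path)
```

The dotted name passed in is the value from `SPIDER_MODULES`, which is the setting the warning asks to check.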

Edit

As eLRuLL suggested, I added an __init__.py file to the spiders folder and also changed scrapy.spider to scrapy.spiders, as I was told to. Now cmd returns:

C:\Users\nastea\Desktop\scrapscrapy>scrapy crawl ss
c:\python27\lib\site-packages\scrapy-1.3.2-py2.7.egg\scrapy\spiderloader.py:37: RuntimeWarning:
Traceback (most recent call last):
  File "c:\python27\lib\site-packages\scrapy-1.3.2-py2.7.egg\scrapy\spiderloader.py", line 31, in _load_all_spiders
    for module in walk_modules(name):
  File "c:\python27\lib\site-packages\scrapy-1.3.2-py2.7.egg\scrapy\utils\misc.py", line 71, in walk_modules
    submod = import_module(fullpath)
  File "c:\python27\lib\importlib\__init__.py", line 37, in import_module
    __import__(name)
  File "C:\Users\nastea\Desktop\scrapscrapy\scrapscrapy\spiders\SSSpider.py", line 3, in <module>
    from Scrapscrapy.items import ScrapscrapyItem
ImportError: No module named Scrapscrapy.items
Could not load spiders from module 'scrapscrapy.spiders'. Check SPIDER_MODULES setting
  warnings.warn(msg, RuntimeWarning)
2017-02-19 15:13:36 [scrapy.utils.log] INFO: Scrapy 1.3.2 started (bot: scrapscrapy)
2017-02-19 15:13:36 [scrapy.utils.log] INFO: Overridden settings: {'NEWSPIDER_MODULE': 'scrapscrapy.spiders', 'SPIDER_MODULES': ['scrapscrapy.spiders'], 'ROBOTSTXT_OBEY': True, 'BOT_NAME': 'scrapscrapy'}
Traceback (most recent call last):
  File "c:\python27\Scripts\scrapy-script.py", line 11, in <module>
    load_entry_point('scrapy==1.3.2', 'console_scripts', 'scrapy')()
  File "c:\python27\lib\site-packages\scrapy-1.3.2-py2.7.egg\scrapy\cmdline.py", line 142, in execute
    _run_print_help(parser, _run_command, cmd, args, opts)
  File "c:\python27\lib\site-packages\scrapy-1.3.2-py2.7.egg\scrapy\cmdline.py", line 88, in _run_print_help
    func(*a, **kw)
  File "c:\python27\lib\site-packages\scrapy-1.3.2-py2.7.egg\scrapy\cmdline.py", line 149, in _run_command
    cmd.run(args, opts)
  File "c:\python27\lib\site-packages\scrapy-1.3.2-py2.7.egg\scrapy\commands\crawl.py", line 57, in run
    self.crawler_process.crawl(spname, **opts.spargs)
  File "c:\python27\lib\site-packages\scrapy-1.3.2-py2.7.egg\scrapy\crawler.py", line 162, in crawl
    crawler = self.create_crawler(crawler_or_spidercls)
  File "c:\python27\lib\site-packages\scrapy-1.3.2-py2.7.egg\scrapy\crawler.py", line 190, in create_crawler
    return self._create_crawler(crawler_or_spidercls)
  File "c:\python27\lib\site-packages\scrapy-1.3.2-py2.7.egg\scrapy\crawler.py", line 194, in _create_crawler
    spidercls = self.spider_loader.load(spidercls)
  File "c:\python27\lib\site-packages\scrapy-1.3.2-py2.7.egg\scrapy\spiderloader.py", line 51, in load
    raise KeyError("Spider not found: {}".format(spider_name))
KeyError: 'Spider not found: ss'
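
The second traceback fails on `from Scrapscrapy.items import ScrapscrapyItem`, while the package directory in the listing is `scrapscrapy`, all lowercase. Python matches module names case-sensitively, so the capitalised name is the most likely cause. A minimal sketch that reproduces this with a throwaway package; the temporary directory and the helper `try_import` are assumptions made for illustration:

```python
import importlib
import os
import sys
import tempfile


def try_import(dotted_name):
    """Return True if dotted_name can be imported, False on ImportError."""
    try:
        importlib.import_module(dotted_name)
        return True
    except ImportError:
        return False


# Build a throwaway package laid out like the one in the question.
root = tempfile.mkdtemp()
package_dir = os.path.join(root, "scrapscrapy")
os.makedirs(package_dir)
for filename in ("__init__.py", "items.py"):
    open(os.path.join(package_dir, filename), "w").close()

# Make the temporary project root importable.
sys.path.insert(0, root)

lowercase_ok = try_import("scrapscrapy.items")    # matches the directory name
capitalised_ok = try_import("Scrapscrapy.items")  # differs only in case

print(lowercase_ok, capitalised_ok)
```

If this is indeed the cause, changing the import in SSSpider.py to `from scrapscrapy.items import ScrapscrapyItem` should let the spider module load, after which `scrapy crawl ss` can find the spider by its name.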
