<p>Yes, of course you can ;)</p>
<p><a href="http://tryolabs.com/Blog/2011/09/27/calling-scrapy-python-script/" rel="nofollow">The idea (inspired by this blog post)</a> is to create a worker and then use it in your own Python script:</p>
<pre><code>from scrapy import project, signals
from scrapy.conf import settings
from scrapy.crawler import CrawlerProcess
from scrapy.xlib.pydispatch import dispatcher
from multiprocessing.queues import Queue
import multiprocessing

class CrawlerWorker(multiprocessing.Process):

    def __init__(self, spider, result_queue):
        multiprocessing.Process.__init__(self)
        self.result_queue = result_queue
        self.crawler = CrawlerProcess(settings)
        if not hasattr(project, 'crawler'):
            self.crawler.install()
        self.crawler.configure()
        self.items = []
        self.spider = spider
        # Collect every scraped item as it passes through the pipeline
        dispatcher.connect(self._item_passed, signals.item_passed)

    def _item_passed(self, item):
        self.items.append(item)

    def run(self):
        self.crawler.crawl(self.spider)
        self.crawler.start()
        self.crawler.stop()
        # Hand the collected items back to the parent process
        self.result_queue.put(self.items)
</code></pre>
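<p>The Scrapy-specific calls aside, the core of this worker is a plain <code>multiprocessing.Process</code> that ships its results back through a <code>Queue</code>. A minimal, Scrapy-free sketch of that same pattern (the doubling step merely stands in for the crawl):</p>

```python
import multiprocessing

class Worker(multiprocessing.Process):
    """Run work in a child process and return results via a queue,
    mirroring the CrawlerWorker pattern above (no Scrapy involved)."""

    def __init__(self, data, result_queue):
        multiprocessing.Process.__init__(self)
        self.data = data
        self.result_queue = result_queue

    def run(self):
        # Stand-in for crawler.crawl()/crawler.start(): produce some items
        items = [x * 2 for x in self.data]
        # Hand the results back to the parent process
        self.result_queue.put(items)

if __name__ == "__main__":
    queue = multiprocessing.Queue()
    worker = Worker([1, 2, 3], queue)
    worker.start()
    items = queue.get()  # blocks until the child puts its results
    worker.join()
    print(items)
```

Because the crawl runs in a separate process, the Twisted reactor restriction (it cannot be restarted within one process) does not bite if you need to run several crawls from one script.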
<p>Usage example (a sketch, with <code>MySpider</code> standing in for your own spider class):</p>
<pre><code>result_queue = Queue()
crawler = CrawlerWorker(MySpider(), result_queue)
crawler.start()
for item in result_queue.get():
    print item
</code></pre>
<p>Another approach is to execute the <code>scrapy crawl</code> command with <code>system()</code>.</p>
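<p>If you go that route, <code>subprocess</code> is the safer way to launch the command than <code>os.system()</code>, since you avoid shell quoting issues. A sketch, where the spider name <code>myspider</code> and the output file are placeholders for your own values:</p>

```python
import subprocess

def scrapy_crawl_command(spider, output=None):
    """Build the argv list for `scrapy crawl`; -o writes items to a feed file."""
    cmd = ["scrapy", "crawl", spider]
    if output:
        cmd += ["-o", output]
    return cmd

# Runs the crawl as a child process (requires Scrapy on PATH and a project dir):
# subprocess.run(scrapy_crawl_command("myspider", "items.json"), check=True)
```

The downside compared to the worker approach is that you only get the serialized feed output, not live <code>Item</code> objects.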