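<p>What you want to write is a <a href="http://doc.scrapy.org/en/latest/topics/downloader-middleware.html" rel="nofollow">downloader middleware component</a>. You asked whether it's possible to "change the request processing"; its introduction describes it as a system for "globally altering Scrapy's requests and responses". I'm not sure why you'd think that isn't what you're looking for, but if you read on, it's exactly what it sounds like.</p>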
<p>The key method on a <code>DownloaderMiddleware</code> object is <code>process_request</code>. As the docs say:</p>
<blockquote>
<p>This method is called for each request that goes through the download middleware.</p>
<p><code>process_request()</code> should either: <code>return None</code>, return a <code>Response</code> object, return a <code>Request</code> object, or raise <code>IgnoreRequest</code>.</p>
<p>…</p>
<p>If it returns a <code>Response</code> object, Scrapy won’t bother calling any other <code>process_request()</code> or <code>process_exception()</code> methods, or the appropriate download function; it’ll return that response.</p>
</blockquote>
<p>So you just need to write a <code>DownloaderMiddleware</code> whose <code>process_request</code> calls Selenium, processes whatever it returns, and wraps the result in a <code>Response</code> to return.</p>
<p>If it isn't obvious how to do that, the built-in <code>HttpCacheMiddleware</code> should demonstrate it.</p>