<ol>
<li>Also return the HAR data from your Lua script (<a href="https://splash.readthedocs.io/en/stable/scripting-ref.html#splash-har" rel="nofollow noreferrer">https://splash.readthedocs.io/en/stable/scripting-ref.html#splash-har</a>):</li>
</ol>
<pre><code>return {
    html = splash:html(),
    har = splash:har(),
    cookies = splash:get_cookies(),
}
</code></pre>
<ol start="2">
<li>Assuming you are using scrapy-splash (<a href="https://github.com/scrapy-plugins/scrapy-splash" rel="nofollow noreferrer">https://github.com/scrapy-plugins/scrapy-splash</a>), make sure the <code>execute</code> endpoint is set for your request:</li>
</ol>
<p><code>meta['splash']['endpoint'] = 'execute'</code>.</p>
<p>With <code>scrapy.Request</code>, <code>render.json</code> is the default endpoint, but with <code>scrapy_splash.SplashRequest</code> the default is <code>render.html</code>. See these two examples for how to set the endpoint: <a href="https://github.com/scrapy-plugins/scrapy-splash#requests" rel="nofollow noreferrer">https://github.com/scrapy-plugins/scrapy-splash#requests</a></p>
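<p>As a minimal sketch of the setup described so far, the request meta can be built as a plain dict. The names <code>LUA_SOURCE</code> and <code>build_splash_meta</code> are hypothetical, and the Lua script assumes the target URL arrives as <code>args.url</code>; only the <code>splash</code> meta keys <code>endpoint</code> and <code>args</code> come from scrapy-splash.</p>

```python
# Lua script for Splash's execute endpoint. It returns the HAR log
# alongside the HTML and cookies, as in step 1.
# Assumption: the URL to render is available as args.url.
LUA_SOURCE = """
function main(splash, args)
    assert(splash:go(args.url))
    return {
        html = splash:html(),
        har = splash:har(),
        cookies = splash:get_cookies(),
    }
end
"""


def build_splash_meta(lua_source):
    """Build a request meta dict that routes the request through 'execute'.

    Hypothetical helper: it only assembles the dict that scrapy-splash
    reads from request.meta['splash'].
    """
    return {
        "splash": {
            "endpoint": "execute",
            "args": {"lua_source": lua_source},
        }
    }


meta = build_splash_meta(LUA_SOURCE)
# In a spider this dict would be passed along, for example:
#   yield scrapy.Request(url, self.parse, meta=meta)
```

<p>Keeping the dict construction separate from the spider makes it easy to check that the endpoint really is <code>execute</code> before any request is sent.</p>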
<ol start="3">
<li>Now, and only now, you have access to the <code>X-Crawlera-Session</code> header in your parse method:</li>
</ol>
<pre><code>>>> import json
>>> headers = json.loads(response.text)['har']['log']['entries'][0]['response']['headers']
>>> next(x for x in headers if x['name'] == 'X-Crawlera-Session')
{u'name': u'X-Crawlera-Session', u'value': u'2124641382'}
</code></pre>
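<p>The lookup above can be wrapped in a small helper that also tolerates a missing header. This is a sketch rather than part of scrapy-splash: <code>find_har_header</code> and the sample payload are made up for illustration, and the case-insensitive match is an added assumption on top of the original one-liner.</p>

```python
import json


def find_har_header(response_text, name, entry_index=0):
    """Return a response header value from a Splash HAR payload, or None.

    Hypothetical helper. response_text is the JSON body returned by the
    execute endpoint when the Lua script includes har = splash:har().
    """
    har = json.loads(response_text)["har"]
    headers = har["log"]["entries"][entry_index]["response"]["headers"]
    wanted = name.lower()
    for header in headers:
        # HTTP header names are case-insensitive, so compare in lowercase.
        if header["name"].lower() == wanted:
            return header["value"]
    return None


# Made-up payload in the shape produced by the Lua script from step 1.
sample_text = json.dumps({
    "html": "<html></html>",
    "har": {"log": {"entries": [{"response": {"headers": [
        {"name": "Content-Type", "value": "text/html"},
        {"name": "X-Crawlera-Session", "value": "2124641382"},
    ]}}]}},
})

session_id = find_har_header(sample_text, "x-crawlera-session")
print(session_id)  # prints 2124641382
```

<p>Inside a spider's parse method, <code>response.text</code> would take the place of <code>sample_text</code>.</p>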