Python scrapy-puppeteer package - PyPI

scrapy-puppeteer

Detailed description of the scrapy-puppeteer Python project

# Scrapy with Puppeteer
[![PyPI](https://img.shields.io/pypi/v/scrapy-puppeteer.svg)](https://pypi.python.org/pypi/scrapy-puppeteer) [![Build Status](https://travis-ci.org/clemfromspace/scrapy-puppeteer.svg?branch=master)](https://travis-ci.org/clemfromspace/scrapy-puppeteer) [![Test Coverage](https://api.codeclimate.com/v1/badges/86603b736e684dd4f8c9/test_coverage)](https://codeclimate.com/github/clemfromspace/scrapy-puppeteer/test_coverage) [![Maintainability](https://api.codeclimate.com/v1/badges/86603b736e684dd4f8c9/maintainability)](https://codeclimate.com/github/clemfromspace/scrapy-puppeteer/maintainability)

Scrapy middleware to handle JavaScript pages using [puppeteer](https://github.com/GoogleChrome/puppeteer).

This is an attempt to make Scrapy and Puppeteer work together to handle JavaScript-rendered pages.
The design is inspired by the Scrapy [Splash plugin](https://github.com/scrapy-plugins/scrapy-splash).

**Scrapy and Puppeteer**

Scrapy uses [Twisted](https://twistedmatrix.com/trac/), while [pyppeteer](https://miyakogi.github.io/pyppeteer/) (the Python port of puppeteer we are using) uses [asyncio](https://docs.python.org/3/library/asyncio.html) for asynchronous operations.

Luckily, we can use Twisted's [asyncio reactor](https://twistedmatrix.com/documents/18.4.0/api/twisted.internet.asyncioreactor.html) to make the two talk to each other.

… provided by this module.

Before importing scrapy or doing anything else, you must make sure the asyncio reactor is installed:

```python
import asyncio
from twisted.internet import asyncioreactor

# Install the asyncio-based reactor before scrapy (or anything else) pulls in a reactor.
asyncioreactor.install(asyncio.get_event_loop())
```
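Because the install call fails once another reactor has been loaded, it can help to guard it. Below is a minimal sketch of such a guard; the helper names `reactor_already_imported` and `install_asyncio_reactor` are hypothetical and not part of scrapy-puppeteer. It assumes that an installed reactor always shows up as `twisted.internet.reactor` in `sys.modules`.

```python
import asyncio
import sys


def reactor_already_imported() -> bool:
    """Return True if a Twisted reactor module has already been loaded."""
    return "twisted.internet.reactor" in sys.modules


def install_asyncio_reactor() -> None:
    """Install Twisted's asyncio reactor, refusing if a reactor is already loaded."""
    if reactor_already_imported():
        raise RuntimeError(
            "A Twisted reactor is already installed; call this before importing scrapy."
        )
    # Imported lazily so nothing touches twisted.internet.reactor beforehand.
    from twisted.internet import asyncioreactor

    loop = asyncio.new_event_loop()
    asyncio.set_event_loop(loop)
    asyncioreactor.install(loop)
```

Calling `install_asyncio_reactor()` as the first statement of a launcher script, before any `import scrapy`, gives an explicit error instead of a confusing failure later on.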

Use `scrapy_puppeteer.PuppeteerRequest` instead of Scrapy's built-in `Request`, like below:
```python
from scrapy_puppeteer import PuppeteerRequest

def your_parse_method(self, response):
    # Your code…
    yield PuppeteerRequest('http://httpbin.org', self.parse_result)
```
The request will then be handled by puppeteer.
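Since this package is a Scrapy downloader middleware, it presumably has to be enabled in the project settings before such requests are picked up. A minimal sketch of what that could look like follows. The class path `scrapy_puppeteer.PuppeteerMiddleware` and the priority `800` are assumptions that do not appear in this text, so check them against the installed package.

```python
# settings.py (sketch): enable the puppeteer downloader middleware.
# The class path and the priority value are assumptions, not confirmed by this page.
DOWNLOADER_MIDDLEWARES = {
    "scrapy_puppeteer.PuppeteerMiddleware": 800,
}
```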

The `selector` response attribute works as usual (but contains the HTML processed by puppeteer).

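```python
def parse_result(self, response):
    # text() selects the text node of <title>; @text would look for an attribute named "text".
    print(response.selector.xpath('//title/text()').get())
```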

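`scrapy_puppeteer.PuppeteerRequest` accepts the following additional arguments:

`wait_until`
Will be passed to the [`waitUntil`](https://miyakogi.github.io/pyppeteer/_modules/pyppeteer/page.html#Page.goto) parameter of puppeteer.
Defaults to `domcontentloaded`.

`wait_for`
Will be passed to [`waitFor`](https://miyakogi.github.io/pyppeteer/reference.html?highlight=image#pyppeteer.page.Page.waitFor) in puppeteer.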
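To catch typos in these arguments before a request is scheduled, the values can be checked up front. The sketch below is a hypothetical helper (`puppeteer_kwargs` is not part of scrapy-puppeteer). It assumes the argument names `wait_until` and `wait_for`, and that `wait_until` takes the same values pyppeteer's `Page.goto` accepts for `waitUntil`.

```python
from typing import Any, Dict, Optional, Union

# Events that pyppeteer's Page.goto accepts for waitUntil.
VALID_WAIT_UNTIL = {"load", "domcontentloaded", "networkidle0", "networkidle2"}


def puppeteer_kwargs(
    wait_until: str = "domcontentloaded",
    wait_for: Optional[Union[str, int, float]] = None,
) -> Dict[str, Any]:
    """Build keyword arguments for a PuppeteerRequest (hypothetical helper)."""
    if wait_until not in VALID_WAIT_UNTIL:
        raise ValueError(f"unsupported wait_until value: {wait_until!r}")
    kwargs: Dict[str, Any] = {"wait_until": wait_until}
    if wait_for is not None:
        # A selector string or a timeout in milliseconds, as Page.waitFor takes.
        kwargs["wait_for"] = wait_for
    return kwargs
```

A request could then be built as `PuppeteerRequest(url, self.parse_result, **puppeteer_kwargs("networkidle0", wait_for="h1"))`.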

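`screenshot`
When used, puppeteer will take a [screenshot](https://miyakogi.github.io/pyppeteer/reference.html?highlight=headers#pyppeteer.page.Page.screenshot) of the page, and the binary data of the captured .png will be added to the response `meta`: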
```python
yield PuppeteerRequest(
    url,
    self.parse_result,
    screenshot=True
)

def parse_result(self, response):
    # The screenshot is raw PNG data, so the file is opened in binary mode.
    with open('image.png', 'wb') as image_file:
        image_file.write(response.meta['screenshot'])
```
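If the screenshot may be missing or truncated, a small check before writing avoids saving a corrupt file. The sketch below uses a hypothetical helper, `save_screenshot`, and assumes that `response.meta['screenshot']` holds the raw PNG bytes as described above. It only verifies the standard eight-byte PNG signature.

```python
from pathlib import Path

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"  # first eight bytes of every PNG file


def save_screenshot(data: bytes, path: str = "image.png") -> Path:
    """Write screenshot bytes to disk after checking they look like a PNG."""
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("screenshot data does not start with a PNG signature")
    target = Path(path)
    target.write_bytes(data)
    return target
```

Inside a callback this would be called as `save_screenshot(response.meta['screenshot'])`.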


scrapy-puppeteer 0.0.1b0
