ruia_pyppeteer: a Ruia plugin that loads JavaScript with pyppeteer.
ruia pyppeteer
A Ruia plugin for loading JavaScript.
Notice: Works on ruia >= 0.4.9
Installation
pip install ruia_pyppeteer
# For the newest features, install from GitHub
pip install git+https://github.com/ruia-plugins/ruia-pyppeteer
Usage
ruia_pyppeteer uses pyppeteer to load JavaScript.
Note that the first time you use load_js, pyppeteer downloads a recent version of Chromium (~100MB). This happens only once.
Load JavaScript
import asyncio

from ruia_pyppeteer import PyppeteerRequest as Request

request = Request("https://www.jianshu.com/", load_js=True)
response = asyncio.get_event_loop().run_until_complete(request.fetch())
print(response)
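The snippet above runs a coroutine to completion from synchronous code. On Python 3.7+, asyncio.run is a common alternative to get_event_loop().run_until_complete(). The sketch below shows that pattern with a hypothetical stand-in coroutine, fake_fetch, so it runs without the plugin or a browser. It assumes request.fetch() can be awaited the same way inside a coroutine, which has not been checked against every ruia version.

```python
import asyncio


async def fake_fetch(url: str) -> str:
    """Hypothetical stand-in for request.fetch(); performs no network I/O."""
    await asyncio.sleep(0)  # yield to the event loop once, as real I/O would
    return f"fetched {url}"


async def main() -> str:
    # With the plugin installed, this would presumably be: await request.fetch()
    return await fake_fetch("https://www.jianshu.com/")


result = asyncio.run(main())
print(result)
```

asyncio.run creates a fresh event loop and closes it when main() returns, so it should be called only once, at the program's entry point.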
Full example
from ruia import AttrField, TextField, Item
from ruia_pyppeteer import PyppeteerSpider as Spider


class JianshuItem(Item):
    target_item = TextField(css_select='ul.list>li')
    author_name = TextField(css_select='a.name')
    author_url = AttrField(attr='href', css_select='a.name')

    async def clean_author_url(self, author_url):
        return f"https://www.jianshu.com{author_url}"


class JianshuSpider(Spider):
    start_urls = ['https://www.jianshu.com/']
    concurrency = 10

    async def parse(self, response):
        async for item in JianshuItem.get_items(html=response.html):
            # Loading js by using PyppeteerRequest
            print(item)


if __name__ == '__main__':
    JianshuSpider.start()
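The clean_author_url hook in the example turns a relative href into an absolute URL by prefixing the site root. As a minimal, stdlib-only sketch of the same step, the code below uses urllib.parse.urljoin instead of string formatting. This is a different technique from the example's f-string, and it also leaves an already-absolute href unchanged. The helper name absolute_author_url and the sample paths are hypothetical, chosen only for illustration.

```python
from urllib.parse import urljoin

BASE_URL = "https://www.jianshu.com"  # site root used in the example


def absolute_author_url(author_url: str) -> str:
    """Hypothetical helper: resolve an href against the site root."""
    return urljoin(BASE_URL, author_url)


# "/u/abc123" is a made-up path for illustration
relative = absolute_author_url("/u/abc123")
already_absolute = absolute_author_url("https://example.com/profile")
print(relative)
print(already_absolute)
```

The body of such a helper could be used inside clean_author_url if the scraped page mixes relative and absolute links.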
Enjoy it :)