Helpers to fetch and parse text on pages with requests, lxml, and beautifulsoup4

Detailed description of the parse-helper Python project


Install

Install the system requirements for lxml:
% sudo apt-get install -y libxml2 libxslt1.1 libxml2-dev libxslt1-dev zlib1g-dev

or

% brew install libxml2

Install with pip:
% pip3 install parse-helper

Usage

The ph-ddg, ph-download-files, ph-download-file-as, and ph-soup-explore scripts are provided.

$ venv/bin/ph-ddg --help
Usage: ph-ddg [OPTIONS] [QUERY]

  Pass a search query to duckduckgo api

Options:
  --help  Show this message and exit.

$ venv/bin/ph-download-files --help
Usage: ph-download-files [OPTIONS] [ARGS]...

  Download all links to local files

  - args: urls or filenames containing urls

Options:
  --help  Show this message and exit.
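
The help text above says that each argument may be either a URL or the name of a file containing URLs. As a rough illustration of how such arguments could be expanded before downloading, here is a minimal sketch. The function name `expand_url_args` and the one-URL-per-line file format are assumptions made for this example, not parse-helper's actual implementation.

```python
import os


def expand_url_args(args):
    """Turn a mix of URLs and filenames into a flat list of URLs.

    Hypothetical helper for illustration only; not part of parse-helper.
    """
    urls = []
    for arg in args:
        if os.path.isfile(arg):
            # Assumed file format: one URL per line; blank lines are skipped.
            with open(arg) as fp:
                urls.extend(line.strip() for line in fp if line.strip())
        else:
            # Anything that is not an existing file is treated as a URL.
            urls.append(arg)
    return urls
```

Checking for an existing file first means a local file named like a URL would be read rather than fetched, which is one possible design choice among several.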

$ venv/bin/ph-download-file-as --help
Usage: ph-download-file-as [OPTIONS] URL [LOCALFILE]

  Download link to local file

  - url: a string - localfile: a string

Options:
  --help  Show this message and exit.

$ venv/bin/ph-soup-explore --help
Usage: ph-soup-explore [OPTIONS] [URL_OR_FILE]

  Create a soup object from a url or file and explore with ipython

Options:
  --help  Show this message and exit.

In [1]: import parse_helper as ph

In [2]: ph.USER_AGENT
Out[2]: 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Ubuntu Chromium/58.0.3029.110 Chrome/58.0.3029.110 Safari/537.36'

In [3]: ph.duckduckgo_api('adventure time')
2019-08-27 06:21:05,303: Fetching JSON from https://api.duckduckgo.com?q=adventure+time&format=json
Out[3]:
[{'text': 'Adventure Time An American animated television series created by Pendleton Ward for Cartoon Network.',
  'thumbnail': 'https://duckduckgo.com/i/fb8f17fd.png',
  'link': 'https://duckduckgo.com/Adventure_Time'},
 {'text': '"Adventure Time" (pilot) An animated short created by Pendleton Ward, as well as the pilot to the Cartoon Network series...',
  'thumbnail': 'https://duckduckgo.com/i/aa9b49e0.png',
  'link': 'https://duckduckgo.com/Adventure_Time_(pilot)'},
 {'text': "Adventure Time (1959 TV series) A local children's television show on WTAE-TV 4 in Pittsburgh, Pennsylvania, from 1959 to 1975.",
  'thumbnail': '',
  'link': 'https://duckduckgo.com/Adventure_Time_(1959_TV_series)'},
 {'text': "Adventure Time (1967 TV series) A Canadian children's adventure television series which aired on CBC Television in 1967 and 1968.",
  'thumbnail': '',
  'link': 'https://duckduckgo.com/Adventure_Time_(1967_TV_series)'},
 {'text': 'Adventure Time (album) The second album for the rock/pop trio The Elvis Brothers.',
  'thumbnail': '',
  'link': 'https://duckduckgo.com/Adventure_Time_(album)'}]
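
The session above shows duckduckgo_api returning a list of dicts with 'text', 'thumbnail', and 'link' keys after fetching JSON from api.duckduckgo.com. The following is a minimal sketch of how a result of that shape could be produced. It is not parse-helper's own code: the function names are hypothetical, the standard library's urllib stands in for requests to keep the example dependency-free, and the mapping from the API's RelatedTopics entries (Text, Icon.URL, FirstURL) to the output keys is an assumption based on the output shown.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

API_URL = 'https://api.duckduckgo.com'


def extract_related_topics(data):
    """Map a decoded API response to dicts with text/thumbnail/link keys.

    Hypothetical helper; the field mapping is an assumption.
    """
    results = []
    for topic in data.get('RelatedTopics', []):
        # Grouped entries carry a nested 'Topics' list and no 'FirstURL';
        # this sketch simply skips them.
        if 'FirstURL' not in topic:
            continue
        icon = topic.get('Icon') or {}
        results.append({
            'text': topic.get('Text', ''),
            'thumbnail': icon.get('URL', ''),
            'link': topic['FirstURL'],
        })
    return results


def fetch_instant_answer(query, user_agent='Mozilla/5.0'):
    """Fetch and decode the JSON response for a query (needs network access)."""
    url = API_URL + '?' + urlencode({'q': query, 'format': 'json'})
    request = Request(url, headers={'User-Agent': user_agent})
    with urlopen(request, timeout=10) as response:
        return json.load(response)
```

Keeping the parsing step separate from the network call makes it possible to check the mapping against a saved response, for example extract_related_topics(fetch_instant_answer('adventure time')).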
