Scrape books and quotes from GoodReads.



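ScrapeReads

A Python package for scraping data from goodreads.com. Authors, books, and quotes can all be extracted.

The project was developed as part of a deep learning project that uses a GPT-2 model to generate poems.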

Installation

Install the scrapereads package from PyPI:

pip install scrapereads

Or from GitHub:

^{pr2}$

Getting started

GoodReads API

You can search for an author, a book, or quotes from the API:

from scrapereads import GoodReads

# Connect to the API
goodreads = GoodReads()

# Search for an author from its ID
AUTHOR_ID = 3389
author = goodreads.search_author(AUTHOR_ID)

# Search for a book
BOOK_ID = 3048970
book = goodreads.search_book(AUTHOR_ID, BOOK_ID)

# Look for the first 10 books (set ``top_k=None`` to turn the limit off)
books = goodreads.search_books(AUTHOR_ID, top_k=10)

# ...or quotes
quotes = goodreads.search_quotes(AUTHOR_ID, top_k=5)
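The `top_k` argument above caps how many results come back, and `top_k=None` turns the cap off. As a rough illustration of that convention only (this is not the package's own code), the sketch below uses a hypothetical helper, `take_top_k`, built on the standard library's `itertools.islice`:

```python
from itertools import islice


def take_top_k(items, top_k=None):
    """Return at most ``top_k`` items; ``top_k=None`` returns everything.

    Hypothetical helper for illustration, not part of scrapereads.
    """
    # islice(iterable, None) applies no upper bound, so None disables the cap.
    return list(islice(items, top_k))


# Placeholder data standing in for scraped results.
results = ["item-1", "item-2", "item-3", "item-4"]
print(take_top_k(results, top_k=2))
print(take_top_k(results, top_k=None))
```

Because `islice` stops consuming its input once the limit is reached, the same pattern would also work on a lazy generator of scraped pages, assuming results are produced one at a time.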

A quote consists of its text, but optional information can be attached to it (such as the number of likes, tags, the reference, etc.).

quotes = goodreads.search_quotes(AUTHOR_ID, top_k=5)
for quote in quotes:
    print(quote)
    print()

Output:

"Books are a uniquely portable magic."
- Stephen King, from "On Writing: A Memoir Of The Craft"
  Likes: 16225, Tags: books, magic, reading

"If you don't have time to read, you don't have the time (or the tools) to write. Simple as that."
- Stephen King
  Likes: 12565, Tags: reading, writing

"Get busy living or get busy dying."
- Stephen King, from "Different Seasons"
  Likes: 9014, Tags: life

"Books are the perfect entertainment: no commercials, no batteries, hours of enjoyment for each dollar spent. What I wonder is why everybody doesn't carry a book around for those inevitable dead spots in life."
- Stephen King
  Likes: 8667, Tags: books

"When his life was ruined, his family killed, his farm destroyed, Job knelt down on the ground and yelled up to the heavens, "Why god? Why me?" and the thundering voice of God answered, There's just something about you that pisses me off."
- Stephen King, from "Storm Of The Century"
  Likes: 7686, Tags: god, humor, religion
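To make the printed layout above concrete, here is a minimal sketch of how a quote with optional metadata could be rendered in that form. The `QuoteRecord` class and its field names are assumptions made for illustration; they are not taken from the scrapereads source.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class QuoteRecord:
    """Hypothetical container; field names are assumptions, not the scrapereads API."""

    text: str
    author: str
    book: Optional[str] = None   # optional reference
    likes: Optional[int] = None  # optional like count
    tags: List[str] = field(default_factory=list)

    def __str__(self):
        # Second line: the author, plus the book when one is known.
        source = f"- {self.author}"
        if self.book:
            source += f', from "{self.book}"'
        lines = [f'"{self.text}"', source]
        # Third line: only the optional details that are actually present.
        details = []
        if self.likes is not None:
            details.append(f"Likes: {self.likes}")
        if self.tags:
            details.append("Tags: " + ", ".join(self.tags))
        if details:
            lines.append("  " + ", ".join(details))
        return "\n".join(lines)


quote = QuoteRecord(
    text="Get busy living or get busy dying.",
    author="Stephen King",
    book="Different Seasons",
    likes=9014,
    tags=["life"],
)
print(quote)
```

Keeping the optional fields as `None` or an empty list by default means a quote with only text and author still prints cleanly, without an empty details line.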

Structure

The package is organized as follows:

  • Author
  • Book, which inherits from Author
  • Quote, which inherits from Book
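The inheritance chain listed above can be pictured with a small sketch. The constructor arguments and attribute names here are hypothetical and chosen only to show the shape of the hierarchy; the real classes in scrapereads may differ.

```python
class Author:
    """Hypothetical base class: holds what identifies an author."""

    def __init__(self, author_id, author_name=None):
        self.author_id = author_id
        self.author_name = author_name


class Book(Author):
    """A book carries its author's identity plus its own."""

    def __init__(self, author_id, book_id, author_name=None, book_name=None):
        super().__init__(author_id, author_name=author_name)
        self.book_id = book_id
        self.book_name = book_name


class Quote(Book):
    """A quote carries the book's and the author's identity plus its text."""

    def __init__(self, author_id, book_id, text, **names):
        super().__init__(author_id, book_id, **names)
        self.text = text


quote = Quote(3389, 3048970, "some quote text")
print(isinstance(quote, Book), isinstance(quote, Author))
print(quote.author_id, quote.book_id)
```

One consequence of this layout, under these assumptions, is that a quote always knows the IDs of the book and author it came from, which is what makes walking back up to a parent object possible.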

Retrieving data

Once you have one of these objects, you can also access data directly through its methods:

author = goodreads.search_author(AUTHOR_ID)
books = author.get_books()
quotes = author.get_quotes()

# Same from a book
book = goodreads.search_book(AUTHOR_ID, BOOK_ID)
quotes = book.get_quotes()

You can also retrieve parent objects from their children:

author = goodreads.search_author(AUTHOR_ID)
quotes = author.get_quotes(top_k=10)
quote = quotes[0]

# Access the parent objects
book = quote.get_book()
author = quote.get_author()

You can get the description, links, and other details from these objects:

author = goodreads.search_author(AUTHOR_ID)
info = author.get_info()  # description of the author (genre, description, references, etc.)

Finally, you can retrieve authors similar to a given author with:

author = goodreads.search_author(AUTHOR_ID)
authors = author.get_similar_authors(top_k=5)

Save and export

You can save the data in JSON format (and encode it as ASCII if needed).

author = goodreads.search_author(AUTHOR_ID)
author_data = author.to_json(encode='ascii')
# Same for book and quote
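For readers unsure what ASCII-encoded JSON looks like in practice, the sketch below uses only the standard library's `json` module. It assumes that the `encode='ascii'` option has an effect comparable to `json.dumps(..., ensure_ascii=True)`; that is a guess about the behavior, not something confirmed from the scrapereads source, and the `record` dictionary is made-up sample data.

```python
import json

# Made-up sample data containing non-ASCII characters.
record = {"name": "Gabriel García Márquez"}

# ensure_ascii=True escapes every non-ASCII character as \uXXXX.
ascii_json = json.dumps(record, ensure_ascii=True)

# ensure_ascii=False keeps the characters as they are.
utf8_json = json.dumps(record, ensure_ascii=False)

print(ascii_json)
print(utf8_json)

# Both forms decode back to the same data.
print(json.loads(ascii_json) == json.loads(utf8_json))
```

The ASCII form is safer to pass through tools that mishandle UTF-8, while the unescaped form is easier to read; either way no information is lost.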
