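A library providing useful operations on PDFs and images

Detailed description of the pdfutil Python project


PDFUTIL [under development]

The library provides many operations on PDFs and images.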

Input and output

The library exposes every function with the same fixed set of standard parameters.

import pdfutil
coordinates = pdfutil.detect_*(pdf_location, [save_result=False], [show_result=False], [result_location='.'], [args={}])
| Name | Description |
| --- | --- |
| pdf_location | Input location of the PDF. An image can also be passed; the library auto-detects that the input is an image. |
| save_result | Default False. If True, saves the result PDF/image to the location specified by result_location. |
| show_result | Default False. For debugging only: when True, pops up a matplotlib plot highlighting the detected regions with their corresponding labels. |
| result_location | Default is the current directory. Location where the output is saved; ignored if save_result is False. |
| args | Custom set of arguments, passed as a dictionary, specific to each function. |
| coordinates | Output returned by the function call. It contains JSON output in the following format: |
[
  {
    "type": "text",
    "output": {
      "coord": [
        ["pageno_1", "startx_1", "starty_1", "width_1", "height_1"],
        ["pageno_2", "startx_2", "starty_2", "width_2", "height_2"]
      ]
    }
  },
  {
    "type": "table",
    "output": {
      "coord": [
        ["pageno_1", "startx_1", "starty_1", "width_1", "height_1"]
      ]
    }
    }
  }
]
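
The format above can be flattened into one record per detected region, which makes filtering by type or page straightforward. The following is a minimal sketch, not part of pdfutil: the names Region and parse_regions are hypothetical, the sample values are made up, and it assumes that the placeholders such as pageno_1 and startx_1 stand for numeric values and that the result arrives either as a JSON string or as already-decoded Python lists and dicts.

```python
import json
from typing import NamedTuple


class Region(NamedTuple):
    """One detected region (hypothetical helper type, not part of pdfutil)."""
    kind: str      # value of the "type" field, e.g. "text" or "table"
    page: int
    x: float
    y: float
    width: float
    height: float


def parse_regions(coordinates):
    """Flatten the documented output structure into a list of Region records."""
    # Assumption: the result may be a JSON string or decoded Python objects.
    if isinstance(coordinates, str):
        coordinates = json.loads(coordinates)
    regions = []
    for entry in coordinates:
        for page, x, y, width, height in entry["output"]["coord"]:
            regions.append(
                Region(entry["type"], int(page), float(x), float(y),
                       float(width), float(height))
            )
    return regions


# Made-up sample values in the documented shape (not real pdfutil output).
sample = [
    {"type": "text", "output": {"coord": [[1, 50, 80, 400, 120], [2, 50, 60, 400, 300]]}},
    {"type": "table", "output": {"coord": [[1, 50, 220, 400, 150]]}},
]

regions = parse_regions(sample)
tables = [r for r in regions if r.kind == "table"]
```

With real output, parse_regions(coordinates) would take the value returned by one of the detect_* calls, provided it matches the documented shape.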

Operations

Detect tables

import pdfutil
coordinates = pdfutil.detect_tables(pdf_location)

Detect text regions [paragraphs / unstructured content]

import pdfutil
coordinates = pdfutil.detect_text(pdf_location)

Detect non-text regions [images / logos]

import pdfutil
coordinates = pdfutil.detect_non_text(pdf_location)

Detect language

import pdfutil
coordinates = pdfutil.detect_non_language(pdf_location)

Detect key-value pairs

import pdfutil
coordinates = pdfutil.detect_key_value_pairs(pdf_location)
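
Because every operation shares the same parameters and output format, several detectors can be run over one file and their results merged. The sketch below illustrates that idea under stated assumptions: collect_by_page is a hypothetical helper, not a pdfutil function, and the two fake_detect_* functions are stand-ins returning made-up values so the example runs without the library installed.

```python
from collections import defaultdict


def collect_by_page(pdf_location, detectors):
    """Run each detector on the file and group the reported boxes by page."""
    by_page = defaultdict(list)
    for detect in detectors:
        for entry in detect(pdf_location):
            for page, x, y, width, height in entry["output"]["coord"]:
                by_page[page].append((entry["type"], x, y, width, height))
    return dict(by_page)


# Stand-in detectors with made-up values in the documented output shape.
def fake_detect_tables(pdf_location):
    return [{"type": "table", "output": {"coord": [[1, 50, 220, 400, 150]]}}]


def fake_detect_text(pdf_location):
    return [{"type": "text",
             "output": {"coord": [[1, 50, 80, 400, 120], [2, 50, 60, 400, 300]]}}]


layout = collect_by_page("report.pdf", [fake_detect_tables, fake_detect_text])
```

In real use the list would hold functions such as pdfutil.detect_tables and pdfutil.detect_text, assuming they return decoded lists in the documented format rather than a JSON string.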
