Python pdfutil package - PyPI

A library that provides useful operations on PDFs and images.

pdfutil project description

PDFUTIL [under development]

The library provides many operations on PDFs and images.

Input and output

The library exposes every function with the same fixed set of standard parameters:

import pdfutil
coordinates = pdfutil.detect_*(pdf_location, [save_result=False], [show_result=False], [result_location='.'], [args={}])

Name	Description
pdf_location	Input location of the PDF. An image can also be passed; the library detects image input automatically.
save_result	Default False. If True, the result PDF/image is saved to the location given by result_location.
show_result	Default False. For debugging only: when True, a matplotlib plot pops up highlighting the detected regions with their labels.
result_location	Default is the current directory. Location where the output is saved; ignored if save_result is False.
args	Custom arguments specific to each function, passed as a dictionary.
coordinates	Output returned by the function call, containing JSON in the following format:

[
  {
    "type": "text",
    "output": {
      "coord": [
        ["pageno_1", "startx_1", "starty_1", "width_1", "height_1"],
        ["pageno_2", "startx_2", "starty_2", "width_2", "height_2"]
      ]
    }
  },
  {
    "type": "table",
    "output": {
      "coord": [
        ["pageno_1", "startx_1", "starty_1", "width_1", "height_1"]
      ]
    }
  }
]
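
To make the output format concrete, here is a minimal sketch that groups the returned regions by type and converts each `[pageno, startx, starty, width, height]` entry into a corner-based box. It does not call pdfutil: the sample data, its numeric values, and the helper name `regions_by_type` are assumptions made for illustration, and the real library may return the fields in another representation (for example as strings).

```python
from collections import defaultdict

# Hypothetical sample shaped like the documented output; the numbers are made up.
coordinates = [
    {"type": "text", "output": {"coord": [[1, 50, 40, 500, 120], [2, 50, 60, 500, 300]]}},
    {"type": "table", "output": {"coord": [[1, 50, 200, 500, 250]]}},
]


def regions_by_type(coordinates):
    """Map each region type to a list of (page, x0, y0, x1, y1) boxes."""
    grouped = defaultdict(list)
    for entry in coordinates:
        for page, x, y, width, height in entry["output"]["coord"]:
            # Assumes numeric values: turn origin + size into two corners.
            grouped[entry["type"]].append((page, x, y, x + width, y + height))
    return dict(grouped)


boxes = regions_by_type(coordinates)
print(boxes["table"])  # [(1, 50, 200, 550, 450)]
```

Corner-based boxes are often what drawing or cropping code expects, which is why the sketch converts from the width/height form.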

Operations

Detect tables

import pdfutil
coordinates = pdfutil.detect_tables(pdf_location)

Detect text regions [paragraphs / unstructured content]

import pdfutil
coordinates = pdfutil.detect_text(pdf_location)

Detect non-text regions [images / logos]

import pdfutil
coordinates = pdfutil.detect_non_text(pdf_location)

Detect language

import pdfutil
coordinates = pdfutil.detect_non_language(pdf_location)

Detect key-value pairs

import pdfutil
coordinates = pdfutil.detect_key_value_pairs(pdf_location)
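
Because every detection function takes the same standard parameters, several of them can be run over one document in a loop. The sketch below illustrates that idea under stated assumptions: `run_detectors` is a hypothetical helper, and the two `fake_detect_*` functions are stand-ins that return data in the documented format so the example runs without pdfutil installed. With the library available, functions such as `pdfutil.detect_tables` and `pdfutil.detect_text` would presumably be passed in their place, assuming each returns a list in the format described above.

```python
def run_detectors(detectors, pdf_location, **common):
    """Call each detector with the same arguments and concatenate the results."""
    results = []
    for detect in detectors:
        results.extend(detect(pdf_location, **common))
    return results


# Stand-ins that mimic the documented signature and output format (hypothetical).
def fake_detect_tables(pdf_location, save_result=False, show_result=False,
                       result_location=".", args=None):
    return [{"type": "table", "output": {"coord": [[1, 50, 200, 500, 250]]}}]


def fake_detect_text(pdf_location, save_result=False, show_result=False,
                     result_location=".", args=None):
    return [{"type": "text", "output": {"coord": [[1, 50, 40, 500, 120]]}}]


combined = run_detectors(
    [fake_detect_tables, fake_detect_text],
    "sample.pdf",  # placeholder path; the stand-ins never open it
    save_result=False,
)
print([entry["type"] for entry in combined])  # ['table', 'text']
```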


pdfutil 0.0.1
