Python boxdetect package - PyPI

boxdetect is a Python package based on OpenCV that lets you easily detect rectangular shapes, such as character boxes on scanned forms.

Detailed description of the boxdetect Python project

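BoxDetect is a Python package based on OpenCV that lets you easily detect rectangular shapes, such as character boxes or checkboxes, on scanned forms.

The main purpose of this library is to provide helpful functions for processing document images (bank forms, application forms and so on) and for extracting the regions where character boxes or tick/check boxes are present.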

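Features

boxdetect.pipelines.get_boxes - basic pipeline for box extraction
boxdetect.pipelines.get_checkboxes - pipeline that returns only the checkboxes, with a simple estimate of their state (checked/unchecked)
boxdetect.config.PipelinesConfig - advanced config class used to run the pipelines
boxdetect.config.PipelinesConfig.save_yaml/load_yaml - saves and loads configs to and from yaml files
boxdetect.config.PipelinesConfig.autoconfigure - simple mechanism that sets the config automatically from a list of the box sizes you are looking for
boxdetect.config.PipelinesConfig.autoconfigure_from_vott - sets the config automatically from ground truth/annotation json files produced by VoTT
boxdetect.img_proc and boxdetect.rect_proc - utility functions you can use to build your own custom pipelines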

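Getting started

Check out the usage examples below to get a better idea of how it works, or go to the get-started-pipelines.ipynb and get-started-autoconfig.ipynb notebooks, which contain step-by-step examples of using BoxDetect with the pre-made boxdetect.pipelines functions.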

Installation

BoxDetect can be installed directly from this repo using pip:

pip install git+https://github.com/karolzak/boxdetect

or through PyPI:

pip install boxdetect

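Usage examples

You can use BoxDetect either by leveraging one of the pre-made pipelines or by treating the BoxDetect functions as a toolbox for composing your own pipelines that fit your needs.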

Using pre-made pipelines

Using `boxdetect.pipelines`

Detect character boxes and group them together

Start by getting the default PipelinesConfig, then adjust it to your requirements and data:

from boxdetect import config

file_name = 'form_example1.png'

cfg = config.PipelinesConfig()

# important to adjust these values to match the size of boxes on your image
cfg.width_range = (30, 55)
cfg.height_range = (25, 40)

# the more scaling factors, the more accurate the results, but processing takes longer
# too small a scaling factor may cause false positives
# too big a scaling factor will take a lot of processing time
cfg.scaling_factors = [0.7]

# w/h ratio range for boxes/rectangles filtering
cfg.wh_ratio_range = (0.5, 1.7)

# group_size_range starting from 2 will skip all the groups
# with a single box detected inside (like checkboxes)
cfg.group_size_range = (2, 100)

# number of iterations when running the dilation transformation (to enhance the image)
cfg.dilation_iterations = 0
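To make the role of `width_range`, `height_range` and `wh_ratio_range` more concrete, here is a minimal sketch of this kind of size filtering in plain Python. It is not BoxDetect's internal implementation: the helper `keep_rect` and the sample rectangles are hypothetical, and treating the range bounds as inclusive is an assumption.

```python
def keep_rect(rect, width_range, height_range, wh_ratio_range):
    """Return True if an (x, y, w, h) rectangle fits all three ranges.

    Hypothetical helper for illustration; bounds are assumed inclusive.
    """
    _, _, w, h = rect
    ratio = w / h
    return (
        width_range[0] <= w <= width_range[1]
        and height_range[0] <= h <= height_range[1]
        and wh_ratio_range[0] <= ratio <= wh_ratio_range[1]
    )


# Hypothetical candidate rectangles, not taken from a real image
candidates = [
    (276, 276, 40, 33),  # plausible character box
    (10, 10, 300, 33),   # too wide
    (50, 80, 35, 10),    # too short
]

# Same ranges as in the configuration above
kept = [
    r for r in candidates
    if keep_rect(r, (30, 55), (25, 40), (0.5, 1.7))
]
print(kept)
```

Only the first candidate passes all three checks, which is why matching the ranges to the box sizes on your own image matters.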

As a second step, simply run:

from boxdetect.pipelines import get_boxes

rects, grouping_rects, image, output_image = get_boxes(file_name, cfg=cfg, plot=False)

Each element returned in grouping_rects is a bounding rectangle (x, y, w, h) representing one group of character boxes:

print(grouping_rects)

OUT:
# (x, y, w, h)
[(276, 276, 1221, 33), (324, 466, 430, 33), (384, 884, 442, 33), (985, 952, 410, 32), (779, 1052, 156, 33), (253, 1256, 445, 33)]
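Since the rectangles come back as `(x, y, w, h)` tuples, a small amount of post-processing is often useful. The sketch below uses the output shown above, sorts the groups in reading order (top to bottom, then left to right) and converts them to corner coordinates. The helper `to_corners` is hypothetical and not part of BoxDetect.

```python
# Output copied from the example above: (x, y, w, h) per group
grouping_rects = [
    (276, 276, 1221, 33), (324, 466, 430, 33), (384, 884, 442, 33),
    (985, 952, 410, 32), (779, 1052, 156, 33), (253, 1256, 445, 33),
]


def to_corners(rect):
    """Convert (x, y, w, h) to (x1, y1, x2, y2). Hypothetical helper."""
    x, y, w, h = rect
    return (x, y, x + w, y + h)


# Sort by y first, then x, to get a top-to-bottom reading order
ordered = sorted(grouping_rects, key=lambda r: (r[1], r[0]))
corners = [to_corners(r) for r in ordered]

for c in corners:
    print(c)
```

If `image` is a NumPy array, as is usual for images loaded with OpenCV, a group could then be cropped with slicing such as `image[y1:y2, x1:x2]`; that array type is an assumption here.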

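Display the output image with the bounding rectangles drawn on it:

import matplotlib.pyplot as plt

plt.figure(figsize=(20, 20))
plt.imshow(output_image)
plt.show()

Highlight just the checkboxes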

If you only want to highlight the checkboxes, you just need to change a single parameter:

# limit the grouping algorithm to single boxes only (e.g. checkboxes)
cfg.group_size_range = (1, 1)
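The effect of `group_size_range` can be pictured as a filter on the number of boxes in each group. The sketch below is an illustration only and not BoxDetect's grouping algorithm: the groups and the helper `filter_groups` are hypothetical, and treating both bounds as inclusive is an assumption.

```python
def filter_groups(groups, group_size_range):
    """Keep groups whose box count falls inside the range (assumed inclusive)."""
    lo, hi = group_size_range
    return [g for g in groups if lo <= len(g) <= hi]


# Hypothetical groups of (x, y, w, h) boxes
groups = [
    # a row of three adjacent character boxes
    [(276, 276, 40, 33), (318, 276, 40, 33), (360, 276, 40, 33)],
    # a lone checkbox
    [(100, 500, 30, 30)],
]

char_groups = filter_groups(groups, (2, 100))   # skips single boxes
checkbox_groups = filter_groups(groups, (1, 1))  # keeps only single boxes

print(len(char_groups), len(checkbox_groups))
```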

Using `boxdetect.pipelines.get_checkboxes` to retrieve and recognize just the checkboxes

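Assuming we are using the same image and the config is already adjusted (see above), we just need to run:

from boxdetect.pipelines import get_checkboxes

# file_name is the image path defined in the first example
checkboxes = get_checkboxes(file_name, cfg=cfg, px_threshold=0.1, plot=False, verbose=True)

With verbose=True, it prints details about each detected checkbox along with the figures used to estimate its state: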

Processing file:  ../images/form_example1.png
----------------------------------
nonzero_px_count:  3
all_px_count:  858
nonzero_px_count / all_px_count =  0.0034965034965034965
----------------------------------
----------------------------------
nonzero_px_count:  363
all_px_count:  858
nonzero_px_count / all_px_count =  0.4230769230769231
----------------------------------
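The log above reports a ratio of non-zero pixels to all pixels for each checkbox. A plausible reading, and it is only an assumption here, is that a checkbox counts as checked when this ratio exceeds `px_threshold`. The sketch below applies that rule to the two pairs of counts from the log; `estimate_checked` is a hypothetical helper, not a BoxDetect function, and the strict `>` comparison is also an assumption.

```python
def estimate_checked(nonzero_px_count, all_px_count, px_threshold=0.1):
    """Return (ratio, checked) for one checkbox crop.

    Hypothetical helper: assumes "checked" means ratio > px_threshold.
    """
    ratio = nonzero_px_count / all_px_count
    return ratio, ratio > px_threshold


# Pixel counts copied from the verbose output above
for nonzero, total in [(3, 858), (363, 858)]:
    ratio, checked = estimate_checked(nonzero, total)
    state = "checked" if checked else "unchecked"
    print(f"{nonzero}/{total} = {ratio:.4f} -> {state}")
```

Under this assumption, the first box (ratio about 0.0035) would be reported as unchecked and the second (ratio about 0.42) as checked with `px_threshold=0.1`.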

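Now let's look at the details of the results:

import matplotlib.pyplot as plt

print("Output object type: ", type(checkboxes))
for checkbox in checkboxes:
    print("Checkbox bounding rectangle (x,y,width,height): ", checkbox[0])
    print("Result of `contains_pixels` for the checkbox: ", checkbox[1])
    print("Display the cropout of checkbox:")
    plt.figure(figsize=(1, 1))
    plt.imshow(checkbox[2])
    plt.show()

For each checkbox, this prints its bounding rectangle and its contains_pixels result, then displays a small crop of the checkbox image.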
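Because each result carries the rectangle at index 0, the `contains_pixels` result at index 1 and the crop at index 2, splitting the checkboxes by state takes only a few lines. The sketch below uses hypothetical sample data in the same layout, with the image crops replaced by `None`, and assumes that the `contains_pixels` result behaves like a boolean.

```python
# Hypothetical results in the (rect, contains_pixels, crop) layout;
# crops are omitted (None) to keep the sketch self-contained
checkboxes = [
    ((412, 850, 26, 33), False, None),
    ((412, 900, 26, 33), True, None),
    ((412, 950, 26, 33), True, None),
]

checked = [rect for rect, contains_pixels, _ in checkboxes if contains_pixels]
unchecked = [rect for rect, contains_pixels, _ in checkboxes if not contains_pixels]

print(f"checked: {len(checked)}, unchecked: {len(unchecked)}")
for x, y, w, h in checked:
    print(f"checked box at x={x}, y={y} ({w}x{h})")
```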

Using `boxdetect.config.PipelinesConfig.autoconfigure` to quickly set up the config from a list of box sizes

BoxDetect lets you provide a list of the (h, w) sizes of the boxes you are interested in, and it sets up the config automatically to detect those boxes.

from boxdetect import config

cfg = config.PipelinesConfig()

# The values provided below are the box sizes I'm interested in and want to focus on
# [(h, w), (h, w), ...]
cfg.autoconfigure([(46, 46), (44, 43)])
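To give a feel for what deriving a config from box sizes could involve, here is a rough sketch that turns a list of `(h, w)` pairs into width, height and w/h ratio ranges. This is not how `autoconfigure` is implemented: the helper `ranges_from_sizes` and the 10% tolerance are assumptions made purely for illustration.

```python
def ranges_from_sizes(box_sizes, tolerance=0.1):
    """Derive (width_range, height_range, wh_ratio_range) from (h, w) pairs.

    Hypothetical helper: widens each min/max by an assumed tolerance.
    """
    heights = [h for h, _ in box_sizes]
    widths = [w for _, w in box_sizes]
    ratios = [w / h for h, w in box_sizes]

    def widen(values):
        return (min(values) * (1 - tolerance), max(values) * (1 + tolerance))

    return widen(widths), widen(heights), widen(ratios)


# Same box sizes as passed to autoconfigure above
width_range, height_range, wh_ratio_range = ranges_from_sizes([(46, 46), (44, 43)])

print("width_range:", width_range)
print("height_range:", height_range)
print("wh_ratio_range:", wh_ratio_range)
```

Compare any such hand-derived ranges with the values that `autoconfigure` actually sets on `cfg` before relying on them.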

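Once that is done, you can use any of the boxdetect.pipelines functions, as shown below:

from boxdetect.pipelines import get_checkboxes

# file_name is the image path defined in the first example
checkboxes = get_checkboxes(file_name, cfg=cfg, plot=False)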

Using `boxdetect.config.PipelinesConfig.autoconfigure_from_vott` to quickly and easily set up config params based on annotated ground truth

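Another option is to use ground truth annotations from VoTT.
Check the VoTT repo and docs to learn how to create a new project and start labelling your data: https://github.com/microsoft/VoTT

In this example, I used VoTT to label my input image.

In principle, you only need to mark a single box for each distinct size, but the more boxes you annotate, the more accurate the results should be.

from boxdetect import config

cfg = config.PipelinesConfig()
cfg.autoconfigure_from_vott(vott_dir="../tests/data/autoconfig_simple", class_tags=["box"])

Once that is done, you can use any of the boxdetect.pipelines functions, as shown below:

from boxdetect.pipelines import get_checkboxes

# file_name is the image path defined in the first example
checkboxes = get_checkboxes(file_name, cfg=cfg, plot=False)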

Saving and loading configuration to `yaml` files

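If you want to save a specific configuration for later reuse or automation, you can use the PipelinesConfig functions save_yaml and load_yaml, as shown below:

from boxdetect import config

cfg = config.PipelinesConfig()
cfg.morph_kernels_thickness = 10
cfg.save_yaml('test_cfg.yaml')

# create a second config object and load the saved values into it
cfg2 = config.PipelinesConfig()
cfg2.load_yaml('test_cfg.yaml')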

boxdetect 1.0.0
