How do I enable digits only in pytesser?

Posted 2024-09-27 07:17:31


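I'm running pytesser to OCR images in Python. The first image I capture from a page is recognized fine, but over the next few pages the accuracy drops until 87+1 comes out as $+$

Strange, right? My guess is that pytesser (a Python port of tesseract) is built to recognize words and carries the OCR context over into the next problem. So if there is no way to disable that, I could just restrict it to digits, right? But pytesser has very little documentation, so I went on to read the tesseract FAQ, and I didn't really follow the code it gives.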

Use
TessBaseAPI::SetVariable("tessedit_char_whitelist", "0123456789");
BEFORE calling an Init function or put this in a text file called tessdata/configs/digits:
tessedit_char_whitelist 0123456789
and then your command line becomes:
tesseract image.tif outputbase nobatch digits
Warning: Until the old and new config variables get merged, you must have the nobatch parameter too.

I'm guessing that is C or C++. Is there any way to do this in Python? Or, better yet, to disable the OCR context?


1 answer

User

#1 · Posted 2024-09-27 07:17:31

In Python:

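import tesseract  # exposes TessBaseAPI; this is not the pytesser module

ocr = tesseract.TessBaseAPI()
# Load the English data from the current directory, classic engine only
ocr.Init(".", "eng", tesseract.OEM_TESSERACT_ONLY)
# Restrict recognition to the digits 0-9
ocr.SetVariable("tessedit_char_whitelist", "0123456789")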

You may also want:

^{pr2}$
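The command-line route quoted from the FAQ in the question can also be driven from Python without any bindings. The following is only a minimal sketch under some assumptions: the helper names (write_digits_config, build_command, ocr_digits) are hypothetical, the tesseract executable is assumed to be on PATH, and the config file is assumed to be written into the tessdata directory that your installation actually reads. The nobatch argument is kept only because the FAQ text asks for it.

```python
import subprocess
from pathlib import Path


def write_digits_config(tessdata_dir, whitelist="0123456789", name="digits"):
    """Write a config file containing the whitelist line quoted in the FAQ.

    tessdata_dir is assumed to be the tessdata directory of your install.
    """
    config_dir = Path(tessdata_dir) / "configs"
    config_dir.mkdir(parents=True, exist_ok=True)
    config_path = config_dir / name
    config_path.write_text(f"tessedit_char_whitelist {whitelist}\n")
    return config_path


def build_command(image_path, output_base, config_name="digits"):
    """Build the FAQ command line as an argument list (no shell quoting needed)."""
    return ["tesseract", str(image_path), str(output_base), "nobatch", config_name]


def ocr_digits(image_path, output_base):
    """Run tesseract and return the recognized text.

    Assumes tesseract is on PATH and writes its result to <output_base>.txt.
    """
    subprocess.run(build_command(image_path, output_base), check=True)
    return Path(f"{output_base}.txt").read_text().strip()
```

Since the sample in the question is 87+1, a digits-only whitelist would also rule out the plus sign; passing whitelist="0123456789+" to write_digits_config is one way to keep it, if that suits your images.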
