Improving Tesseract detection on grayscale images

Posted 2024-10-02 14:28:33

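I am having trouble recognizing text in low-contrast files that all share the same layout. I am using PyTesseract, and for some files, such as this one, it returns nothing at all: http://i.imgur.com/l91O5JH.png

I use the LineBoxBuilder with PyTesseract. Before that, I convert the PDF to JPG: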

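from wand.color import Color
from wand.image import Image as Img

# Render the PDF at 300 DPI and flatten any transparency onto white before saving.
def save_img_with_wand(self, pdfName, output):
    with Img(filename=pdfName, resolution=300) as pic:
        pic.compression_quality = 100
        pic.background_color    = Color("white")
        pic.alpha_channel       = 'remove'
        pic.save(filename=output)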

The line box builder:

import pyocr.builders
import pytesseract

def line_box_builder(self, img):
    try:
        return self.tool.image_to_string(
            img,
            lang=self.lang,
            builder=pyocr.builders.LineBoxBuilder()
        )

    except pytesseract.pytesseract.TesseractError as t:
        # The error is only logged, so the method returns None in this case.
        self.Log.error('Tesseract ERROR : ' + str(t))

If nothing is found, I use OpenCV to improve detection:

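import cv2

@staticmethod
def improve_image_detection(img):
    # Load as grayscale, binarize using an 11x11 Gaussian-weighted neighbourhood
    # with constant C=2, then overwrite the original file in place.
    src     = cv2.imread(img, cv2.IMREAD_GRAYSCALE)
    dst     = cv2.adaptiveThreshold(src, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY, 11, 2)
    cv2.imwrite(img, dst)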
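
The "OCR first, preprocess only if nothing is found" flow described above could be sketched roughly as follows. This is a minimal sketch, not the code from the question: `ocr_with_fallback`, `run_ocr` and `preprocess` are hypothetical names, and it assumes both callables take a file path, with `preprocess` rewriting the file in place the way `improve_image_detection` does.

```python
from typing import Callable, List, Optional


def ocr_with_fallback(
    path: str,
    run_ocr: Callable[[str], Optional[List]],
    preprocess: Callable[[str], None],
) -> List:
    """Run OCR once; if nothing is found, preprocess the file and retry once."""
    # A wrapper like line_box_builder returns None when the Tesseract error
    # is caught, so None and an empty list are both treated as "nothing found".
    lines = run_ocr(path) or []
    if lines:
        return lines

    # Hypothetical hook: assumed to modify the file at `path` in place.
    preprocess(path)
    return run_ocr(path) or []
```

Passing the two steps in as callables keeps the retry logic separate from Tesseract and OpenCV, so it can be exercised with stand-in functions.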

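I have tried several OpenCV approaches, but in every case I could not read the text against a background as poor as the one in the image above.

Thanks in advance for your help.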
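
For reference, one preprocessing step that is sometimes tried on low-contrast scans before thresholding is a percentile-based contrast stretch. The sketch below is only an illustration under assumptions: `stretch_contrast` and its percentile defaults are hypothetical, it is not one of the approaches listed in the question, and it has not been tested on the image linked above.

```python
import numpy as np


def stretch_contrast(
    gray: np.ndarray, low_pct: float = 2.0, high_pct: float = 98.0
) -> np.ndarray:
    """Linearly rescale a uint8 grayscale image so the chosen percentiles map to 0 and 255."""
    lo, hi = np.percentile(gray, [low_pct, high_pct])
    if hi <= lo:
        # Flat image: there is no range to stretch, so return an unchanged copy.
        return gray.copy()

    # Shift and scale in floating point, then clip back into the uint8 range.
    scaled = (gray.astype(np.float32) - lo) * (255.0 / (hi - lo))
    return np.clip(scaled, 0, 255).astype(np.uint8)
```

Because `cv2.imread(..., cv2.IMREAD_GRAYSCALE)` returns a uint8 NumPy array, the array it loads could be passed through a function like this before `cv2.adaptiveThreshold` is called.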
