Binarizing images with a poor background using OpenCV in Python

Posted 2024-09-27 09:32:21


I am trying to binarize a passport image for OCR using the following steps:

img = cv2.medianBlur(nid_aligned_image,3)
img = cv2.threshold(img, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)[1] 

This approach works for images with a cleaner background, but not for images of the type shown here:

enter image description here

This is the output, which the OCR cannot read:

enter image description here

Can anyone suggest a better approach?


1 Answer

My approach to this problem:


1- Apply adaptive thresholding

2- Apply morphological transformations

3- Apply bitwise operations

Step 1: Adaptive thresholding


  • From the documentation:

    • if an image has different lighting conditions in different areas. In that case, adaptive thresholding can help. Here, the algorithm determines the threshold for a pixel based on a small region around it. So we get different thresholds for different regions of the same image which gives better results for images with varying illumination.

    • In short: adaptive thresholding is used when a single global threshold value performs poorly.

    • img2 = cv2.imread("BESFs.png")
      gry2 = cv2.cvtColor(img2, cv2.COLOR_BGR2GRAY)
      
      flt = cv2.adaptiveThreshold(gry2,
                                   100, cv2.ADAPTIVE_THRESH_MEAN_C,
                                   cv2.THRESH_BINARY, 13, 16)
      
    • Result:

      • enter image description here

Step 2: Morphological transformations


  • From the documentation:

    • It needs two inputs, one is our original image, second one is called structuring element or kernel which decides the nature of operation

  • We need to define a kernel (filter) to process the image:

    • krn = np.ones((3, 3), np.uint8)
      
  • We will use opening and closing:

    • Opening is just another name of erosion followed by dilation. It is useful in removing noise

    • Closing is reverse of Opening, Dilation followed by Erosion. It is useful in closing small holes inside the foreground objects, or small black points on the object.

  • opn = cv2.morphologyEx(flt, cv2.MORPH_OPEN, krn)
    cls = cv2.morphologyEx(opn, cv2.MORPH_CLOSE, krn)
    

Step 3: Bitwise operations


  • From the documentation:

    • They will be highly useful while extracting any part of the image

    • gry2 = cv2.bitwise_or(gry2, cls)
      
    • Result:

      • enter image description here
  • Now, if we use pytesseract to extract the text:

    • txt = pytesseract.image_to_string(gry2)
      txt = txt.rstrip().split('\n\n')[1].split(' ')[1]
      print("Passport number: {}".format(txt))
      
    • Result:

      • Passport number: BC0874168
        

Optional


For future OCR problems, you can try increasing the image resolution. For example:

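from PIL import Image

img = Image.open("BESFs.png")
w, h = img.size  # PIL returns (width, height)
# Scale factor that caps the width at 1024 px; min(1, ...) means the image is never enlarged.
fct = min(1, 1024.0 / w)
sz = int(fct * w), int(fct * h)
# Image.LANCZOS is the same filter as Image.ANTIALIAS, which was removed in Pillow 10.
im_rsz = img.resize(sz, Image.LANCZOS)
im_rsz.save("out_dpi_300.png", dpi=(300, 300))  # write 300 DPI metadata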

It has no effect for this particular problem, but it may help you in the future.

Full code:

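import cv2
import pytesseract
import numpy as np

# Load the image and convert it to grayscale
img2 = cv2.imread("BESFs.png")
gry2 = cv2.cvtColor(img2, cv2.COLOR_BGR2GRAY)

# Step 1: adaptive threshold (block size 13, constant 16, max value 100)
flt = cv2.adaptiveThreshold(gry2,
                            100, cv2.ADAPTIVE_THRESH_MEAN_C,
                            cv2.THRESH_BINARY, 13, 16)
# Step 2: opening, then closing, with a 3x3 kernel
krn = np.ones((3, 3), np.uint8)
opn = cv2.morphologyEx(flt, cv2.MORPH_OPEN, krn)
cls = cv2.morphologyEx(opn, cv2.MORPH_CLOSE, krn)
# Step 3: combine the cleaned mask with the grayscale image
gry2 = cv2.bitwise_or(gry2, cls)
# OCR, then pick the passport number out of the recognized text
# (the indices below are specific to this image's layout)
txt = pytesseract.image_to_string(gry2)
txt = txt.rstrip().split('\n\n')[1].split(' ')[1]
print("Passport number: {}".format(txt))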
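The `split('\n\n')[1].split(' ')[1]` indexing depends on the exact layout of the OCR output for this one image. A less position-dependent alternative is to search the text with a regular expression. The sketch below assumes the number looks like the one in this answer's result, two capital letters followed by seven digits. That pattern, the helper name, and the sample string are assumptions for illustration, and real passport number formats differ by country.

```python
import re

# Assumed format: two capital letters followed by seven digits (e.g. BC0874168).
PASSPORT_RE = re.compile(r"\b[A-Z]{2}\d{7}\b")

def find_passport_number(ocr_text):
    """Return the first substring matching the assumed pattern, or None."""
    match = PASSPORT_RE.search(ocr_text)
    return match.group(0) if match else None

# Made-up OCR output, only to exercise the helper.
sample = "PASSPORT\n\nNo. BC0874168 X\n"
print("Passport number: {}".format(find_passport_number(sample)))
```

In the full script this would take the raw result of `pytesseract.image_to_string(gry2)` as input, and returning `None` lets the caller handle a failed OCR pass instead of hitting an `IndexError`.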
