How to remove noise around digits using cv2

2 Answers

User

Answer 1 · edited 2024-09-27 07:21:33

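This is a challenge, but I think I have an interesting approach: pattern matching.

If you zoom in, you can see that the background pattern is made of only 4 possible dots: a single full pixel, a full double pixel, and a double pixel whose left or right pixel is medium gray. So I grabbed these 4 patterns from the picture showing 17.160.000,00 and got to work. You can save them and load them again later; here I just grab them on the fly.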

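import cv2
import numpy as np

# Load the reference picture (the one showing 17.160.000,00) as grayscale
img = cv2.imread('C:/Users/***/17.jpg', cv2.IMREAD_GRAYSCALE)

# Crop the four 3x4 background dot patterns
pattern_1 = img[2:5,1:5]
pattern_2 = img[6:9,5:9]
pattern_3 = img[6:9,11:15]
pattern_4 = img[9:12,22:26]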

# just to show it carries over to other pics ;)
img = cv2.imread('C:/Users/****/6.jpg', cv2.IMREAD_GRAYSCALE)

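Actual pattern matching

Next we match every pattern and threshold the result to find all occurrences. I used 0.7, but you can play with that value a bit. The match result drops some pixels at the sides and marks only a single pixel on the left, so for the first 3 patterns we pad twice (once shifted by one extra column) to hit both pixels. The last pattern is the single pixel, so it does not need the second padding.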

res_1 = cv2.matchTemplate(img,pattern_1,cv2.TM_CCOEFF_NORMED )
thresh_1 = cv2.threshold(res_1,0.7,1,cv2.THRESH_BINARY)[1].astype(np.uint8)
pat_thresh_1 = np.pad(thresh_1,((1,1),(1,2)),'constant')
pat_thresh_15 = np.pad(thresh_1,((1,1),(2,1)), 'constant')
res_2 = cv2.matchTemplate(img,pattern_2,cv2.TM_CCOEFF_NORMED )
thresh_2 = cv2.threshold(res_2,0.7,1,cv2.THRESH_BINARY)[1].astype(np.uint8)
pat_thresh_2 = np.pad(thresh_2,((1,1),(1,2)),'constant')
pat_thresh_25 = np.pad(thresh_2,((1,1),(2,1)), 'constant')
res_3 = cv2.matchTemplate(img,pattern_3,cv2.TM_CCOEFF_NORMED )
thresh_3 = cv2.threshold(res_3,0.7,1,cv2.THRESH_BINARY)[1].astype(np.uint8)
pat_thresh_3 = np.pad(thresh_3,((1,1),(1,2)),'constant')
pat_thresh_35 = np.pad(thresh_3,((1,1),(2,1)), 'constant')
res_4 = cv2.matchTemplate(img,pattern_4,cv2.TM_CCOEFF_NORMED )
thresh_4 = cv2.threshold(res_4,0.7,1,cv2.THRESH_BINARY)[1].astype(np.uint8)
pat_thresh_4 = np.pad(thresh_4,((1,1),(1,2)),'constant')
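To make the padding step easier to follow, here is a minimal NumPy-only sketch of the geometry. The image size and the hit position are made up for illustration; only the 3x4 template shape and the two pad widths come from the answer. It relies on the fact that `cv2.matchTemplate` returns one score per valid top-left template position.

```python
import numpy as np

# Hypothetical sizes for illustration: a 12x20 image and a 3x4 template,
# the same template shape as pattern_1..pattern_4 in the answer.
img_h, img_w = 12, 20
tpl_h, tpl_w = 3, 4

# A match map has shape (H - h + 1, W - w + 1): one entry per top-left position.
hits = np.zeros((img_h - tpl_h + 1, img_w - tpl_w + 1), dtype=np.uint8)
hits[5, 7] = 1  # pretend the template matched with its top-left corner at (5, 7)

# Padding rows by (1, 1) and columns by (1, 2) restores the image shape and
# moves the hit one pixel down and one pixel right, into the template interior.
mask_a = np.pad(hits, ((1, 1), (1, 2)), 'constant')
# Padding columns by (2, 1) instead moves the hit two pixels to the right.
mask_b = np.pad(hits, ((1, 1), (2, 1)), 'constant')

print(mask_a.shape, np.argwhere(mask_a == 1).tolist())  # (12, 20) [[6, 8]]
print(mask_b.shape, np.argwhere(mask_b == 1).tolist())  # (12, 20) [[6, 9]]
```

The two masks together cover the two middle pixels of the 3x4 template window, which is where a double-pixel dot sits.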

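Editing the image

Now the only thing left to do is remove all matches from the image. Since the background is mostly white, we simply set them to 255 so they blend in.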

img[pat_thresh_1==1] = 255
img[pat_thresh_15==1] = 255
img[pat_thresh_2==1] = 255
img[pat_thresh_25==1] = 255
img[pat_thresh_3==1] = 255
img[pat_thresh_35==1] = 255
img[pat_thresh_4==1] = 255
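The seven assignments above could also be folded into one step. The following is only a sketch: `erase_matches` is a hypothetical helper name, and the tiny arrays stand in for the real image and the padded threshold masks.

```python
import numpy as np

def erase_matches(img, masks, fill=255):
    """Return a copy of img with every pixel flagged in any mask set to fill."""
    combined = np.logical_or.reduce([m == 1 for m in masks])
    out = img.copy()
    out[combined] = fill
    return out

# Toy data: white background, two noise dots, one dark "digit" pixel.
img = np.full((4, 6), 255, dtype=np.uint8)
img[1, 2] = 40
img[2, 4] = 60
img[3, 0] = 0

mask_1 = np.zeros_like(img)
mask_1[1, 2] = 1
mask_2 = np.zeros_like(img)
mask_2[2, 4] = 1

cleaned = erase_matches(img, [mask_1, mask_2])
```

With the answer's variables, the call would be `erase_matches(img, [pat_thresh_1, pat_thresh_15, ...])`. Unlike the in-place assignments, this version leaves the input image untouched.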

Output

Edit:

Take a look at Abstract's answer for refining this output and fine-tuning tesseract.

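User

Answer 2 · edited 2024-09-27 07:21:33

You can find a solution with a slightly more complex approach by filtering in the frequency domain instead of the spatial domain. The thresholds may need some tweaking depending on how tesseract performs on the output images.

Implementation: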

import cv2
import numpy as np
from matplotlib import pyplot as plt

img = cv2.imread('C:\\Test\\number.jpg', cv2.IMREAD_GRAYSCALE)

# Perform 2D FFT
f = np.fft.fft2(img)
fshift = np.fft.fftshift(f)
magnitude_spectrum = 20*np.log(np.abs(fshift))

# Squash all of the frequency magnitudes above a threshold
fshift[magnitude_spectrum > 195] = 0

# Inverse FFT back into the real-spatial-domain
f_ishift = np.fft.ifftshift(fshift)
img_back = np.fft.ifft2(f_ishift)
img_back = np.real(img_back)
img_back = cv2.normalize(img_back, None, alpha=0, beta=255, norm_type=cv2.NORM_MINMAX, dtype=cv2.CV_32F)
# Use the inverted FFT image to keep only the black values below a threshold
out_img = np.where(img_back < 100, 0, 255).astype(img.dtype)

plt.subplot(131),plt.imshow(img, cmap = 'gray')
plt.title('Input Image'), plt.xticks([]), plt.yticks([])
plt.subplot(132),plt.imshow(img_back, cmap = 'gray')
plt.title('Reversed FFT'), plt.xticks([]), plt.yticks([])
plt.subplot(133),plt.imshow(out_img, cmap = 'gray')
plt.title('Output'), plt.xticks([]), plt.yticks([])
plt.show()
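To see why zeroing the strongest frequency components removes a regular background pattern, here is a small sketch on synthetic data. The function name, the 64x64 test image, and the threshold of 220 are all assumptions chosen for the demonstration, not values from the answer. The log scale uses the natural logarithm, as `np.log` does in the code above.

```python
import numpy as np

def suppress_strong_frequencies(img, log_mag_threshold):
    """Zero every FFT coefficient whose log-magnitude exceeds the threshold,
    then return the inverse transform rescaled to the 0..255 range."""
    fshift = np.fft.fftshift(np.fft.fft2(img.astype(np.float64)))
    # The small epsilon is an addition in this sketch to avoid log(0) warnings.
    log_mag = 20 * np.log(np.abs(fshift) + 1e-12)
    fshift[log_mag > log_mag_threshold] = 0
    back = np.real(np.fft.ifft2(np.fft.ifftshift(fshift)))
    span = back.max() - back.min()
    if span == 0:
        return np.zeros_like(back)
    return (back - back.min()) / span * 255

# Synthetic stand-in: a strong periodic pattern along x plus a faint one along y.
rows, cols = np.mgrid[0:64, 0:64]
strong = 100 * np.cos(2 * np.pi * 8 * cols / 64)  # dominant background pattern
weak = 10 * np.cos(2 * np.pi * 3 * rows / 64)     # faint structure to keep
img = 128 + strong + weak

out = suppress_strong_frequencies(img, log_mag_threshold=220)
```

In this toy case the mean brightness and the strong pattern sit above the threshold and are removed, so `out` varies only along the rows. On a real scan the digits also contribute strong low frequencies, which is why the threshold needs tuning.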

Output:

Median blur implementation:

import cv2
import numpy as np

img = cv2.imread('C:\\Test\\number.jpg', cv2.IMREAD_GRAYSCALE)
blur = cv2.medianBlur(img, 3)

blur[blur < 20] = 0

cv2.imshow("Test", blur)
cv2.waitKey()
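For intuition on what a 3x3 median filter does to isolated dots, here is a NumPy-only sketch. It is not a replacement for `cv2.medianBlur`: `median3x3` is a hypothetical helper, the edge-replicating border is an assumption of this sketch, and `sliding_window_view` requires NumPy 1.20 or newer.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def median3x3(img):
    """Replace each pixel by the median of its 3x3 neighbourhood."""
    padded = np.pad(img, 1, mode='edge')
    windows = sliding_window_view(padded, (3, 3))  # shape (H, W, 3, 3)
    return np.median(windows, axis=(2, 3)).astype(img.dtype)

# Toy image: white background, one isolated noise dot, one solid 3x3 block
# standing in for part of a digit stroke.
img = np.full((7, 9), 255, dtype=np.uint8)
img[1, 1] = 0
img[3:6, 4:7] = 0

out = median3x3(img)
```

The isolated dot disappears because it is outvoted by its eight white neighbours, while the centre of the solid block survives. The corners of the block are lost, which hints at why median blurring can also damage thin strokes.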

Output:

Final edit:

So, using Eumel's solution and adding this piece of code at the bottom of it yields 100% successful results:

img[pat_thresh_1==1] = 255
img[pat_thresh_15==1] = 255
img[pat_thresh_2==1] = 255
img[pat_thresh_25==1] = 255
img[pat_thresh_3==1] = 255
img[pat_thresh_35==1] = 255
img[pat_thresh_4==1] = 255

# Eumel's code above this line

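import pytesseract
from PIL import Image

img = cv2.erode(img, np.ones((3, 3), np.uint8))

cv2.imwrite("out.png", img)
cv2.imshow("Test", img)
cv2.waitKey()

# --psm 10: treat the image as a single character; --oem 3: default OCR engine mode
print(pytesseract.image_to_string(Image.open("out.png"), lang='eng', config='--psm 10 --oem 3 -c tessedit_char_whitelist=0123456789.,'))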
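It may seem odd that erosion helps here, so the following NumPy-only sketch shows the effect on dark-on-white text. Erosion with a 3x3 kernel of ones takes the minimum over each neighbourhood, which makes dark strokes thicker. `erode3x3` is a hypothetical helper and the white (255) border padding is an assumption of this sketch.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def erode3x3(img):
    """Replace each pixel by the minimum of its 3x3 neighbourhood."""
    padded = np.pad(img, 1, mode='constant', constant_values=255)
    windows = sliding_window_view(padded, (3, 3))  # shape (H, W, 3, 3)
    return windows.min(axis=(2, 3))

# Toy image: a single dark pixel on a white background.
img = np.full((5, 5), 255, dtype=np.uint8)
img[2, 2] = 0

out = erode3x3(img)
```

The single dark pixel grows into a 3x3 dark block. In the answer, this presumably fills the small gaps that pattern removal leaves in the digits before they are passed to tesseract.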

Example output image: