Automatic contrast and brightness adjustment of a color photo of a sheet of paper with OpenCV

3 Answers

User
Answer 1 · edited 2024-09-26 18:16:17

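Brightness and contrast can be adjusted using alpha (α) and beta (β), respectively. The expression can be written as
g(i,j) = α * f(i,j) + β
OpenCV already implements this as cv2.convertScaleAbs(), so we can use this function with user-defined alpha and beta values.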
import cv2
import numpy as np
from matplotlib import pyplot as plt

image = cv2.imread('1.jpg')

alpha = 1.95  # Contrast control (1.0-3.0)
beta = 0      # Brightness control (0-100)

manual_result = cv2.convertScaleAbs(image, alpha=alpha, beta=beta)

cv2.imshow('original', image)
cv2.imshow('manual_result', manual_result)
cv2.waitKey()
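To make the formula concrete, here is a minimal NumPy sketch of the per-pixel arithmetic that cv2.convertScaleAbs performs: scale, shift, take the absolute value, then saturate to 8 bits. The helper name `scale_abs` is made up for illustration and is not part of OpenCV.

```python
import numpy as np

def scale_abs(image, alpha=1.0, beta=0.0):
    """Hypothetical NumPy rendition of g = saturate(|alpha * f + beta|)."""
    # Work in float so the intermediate result cannot wrap around in uint8.
    scaled = np.abs(alpha * image.astype(np.float64) + beta)
    # Round to the nearest integer and saturate to the 8-bit range.
    return np.clip(np.rint(scaled), 0, 255).astype(np.uint8)

pixels = np.array([0, 100, 200], dtype=np.uint8)
print(scale_abs(pixels, alpha=1.95, beta=0))
```

Note that values pushed above 255 are clipped rather than wrapped, and negative intermediate values are mirrored by the absolute value instead of being clipped to 0.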
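But the question was
How to get an automatic brightness/contrast optimization of a color photo?
Essentially, the question is how to calculate alpha and beta automatically. To do this, we can look at the histogram of the image. Automatic brightness and contrast optimization calculates alpha and beta so that the output range is [0...255]. We compute the cumulative distribution to find where the color frequency falls below some threshold (say 1%) and cut off the left and right sides of the histogram. This gives us the minimum and maximum of the range. Here is a visualization of the histogram before (blue) and after clipping (orange). Notice how the more "interesting" parts of the image become more pronounced after clipping.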
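The clipping step described above can also be sketched in vectorized form. This is only an illustration with NumPy, assuming an 8-bit grayscale array; the function name `clip_limits` is hypothetical, and the cut points are chosen the same way as in the loop-based code further down (first level whose cumulative count reaches the clip amount, last level whose cumulative count is still below total minus the clip amount).

```python
import numpy as np

def clip_limits(gray, clip_hist_percent=1.0):
    """Return (minimum_gray, maximum_gray) after clipping both histogram tails."""
    hist = np.bincount(gray.ravel(), minlength=256)  # 256-bin histogram
    cdf = np.cumsum(hist)                            # cumulative distribution
    total = cdf[-1]
    # Half of the clip percentage is removed from each side.
    clip = clip_hist_percent * (total / 100.0) / 2.0
    # First gray level whose cumulative count reaches the clip amount.
    minimum_gray = int(np.searchsorted(cdf, clip, side="left"))
    # Last gray level whose cumulative count is still below (total - clip).
    maximum_gray = int(np.searchsorted(cdf, total - clip, side="left")) - 1
    return minimum_gray, maximum_gray

# Toy data: two outliers (0 and 255) around a bulk of values at 50 and 100.
gray = np.concatenate([[0], np.full(99, 50), np.full(99, 100), [255]]).astype(np.uint8)
print(clip_limits(gray, clip_hist_percent=2.0))
```

With this toy input the two outliers are cut away; the right cut lands one level below the last populated bin, which mirrors how the loop version stops.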
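To calculate alpha, we take the minimum and maximum grayscale values after clipping and divide the desired output range of 255 by their difference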
α = 255 / (maximum_gray - minimum_gray)
To calculate beta, we plug it into the formula with g(i, j) = 0 and f(i, j) = minimum_gray
g(i,j) = α * f(i,j) + β
which, after solving for β, gives
β = -minimum_gray * α
For your image we get this
Alpha: 3.75
Beta: -311.25
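As a quick sanity check of the two formulas, the sketch below recomputes alpha and beta from a pair of cut points. The values 83 and 151 are not stated in the answer; they are back-calculated here from the reported alpha and beta, so treat them as an assumption. The helper name `alpha_beta` is also made up for illustration.

```python
def alpha_beta(minimum_gray, maximum_gray, out_max=255):
    """Compute the linear stretch that maps [minimum_gray, maximum_gray] to [0, out_max]."""
    alpha = out_max / (maximum_gray - minimum_gray)
    beta = -minimum_gray * alpha
    return alpha, beta

# Assumed cut points, inferred from Alpha: 3.75 and Beta: -311.25.
alpha, beta = alpha_beta(83, 151)
print(alpha, beta)

# The two ends of the clipped range land on the ends of the output range.
print(alpha * 83 + beta, alpha * 151 + beta)
```

A beta far below zero is expected here: it only means the clipped minimum is well above black, and everything darker saturates after scaling.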
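You may have to adjust the clipping threshold value to refine the results. Here are some example results using a 1% threshold on other images
Automated brightness and contrast code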
import cv2
import numpy as np
from matplotlib import pyplot as plt

# Automatic brightness and contrast optimization with optional histogram clipping
def automatic_brightness_and_contrast(image, clip_hist_percent=1):
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

    # Calculate grayscale histogram (flattened to a 1-D array of 256 counts,
    # so that each entry converts cleanly to a Python float)
    hist = cv2.calcHist([gray], [0], None, [256], [0, 256]).ravel()
    hist_size = len(hist)

    # Calculate cumulative distribution from the histogram
    accumulator = []
    accumulator.append(float(hist[0]))
    for index in range(1, hist_size):
        accumulator.append(accumulator[index - 1] + float(hist[index]))

    # Locate points to clip
    maximum = accumulator[-1]
    clip_hist_percent *= (maximum / 100.0)
    clip_hist_percent /= 2.0

    # Locate left cut
    minimum_gray = 0
    while accumulator[minimum_gray] < clip_hist_percent:
        minimum_gray += 1

    # Locate right cut
    maximum_gray = hist_size - 1
    while accumulator[maximum_gray] >= (maximum - clip_hist_percent):
        maximum_gray -= 1

    # Calculate alpha and beta values
    alpha = 255 / (maximum_gray - minimum_gray)
    beta = -minimum_gray * alpha

    # Optional: calculate new histogram with desired range and show both
    # new_hist = cv2.calcHist([gray], [0], None, [256], [minimum_gray, maximum_gray])
    # plt.plot(hist)
    # plt.plot(new_hist)
    # plt.xlim([0, 256])
    # plt.show()

    auto_result = cv2.convertScaleAbs(image, alpha=alpha, beta=beta)
    return (auto_result, alpha, beta)

image = cv2.imread('1.jpg')
auto_result, alpha, beta = automatic_brightness_and_contrast(image)
print('alpha', alpha)
print('beta', beta)
cv2.imshow('auto_result', auto_result)
cv2.waitKey()
Result image with this code:
Results with other images using a 1% threshold

User
Answer 2 · edited 2024-09-26 18:16:17

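Robust Locally-Adaptive Soft Binarization! That's what I call it.
I've done similar work before, for a slightly different purpose, so this may not perfectly fit your needs, but I hope it helps (I wrote this code at night for personal use, so it's ugly). In a sense, this code is meant to solve a more general case than yours, where there can be a lot of structured noise in the background (see the demo below).
What this code does? Given a photo of a sheet of paper, it will whiten it so that it can be perfectly printable. See example images below.
Teaser: this is how your pages will look after this algorithm (before and after). Notice that even the color marker annotations are gone, so I don't know whether this fits your use case, but the code might be useful:
To get perfectly clean results, you might need to play around with the filtering parameters a bit, but as you can see, it works quite well even with the default parameters.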
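Step 0: Cut the images to fit closely to the page
Let's suppose you have somehow done this step already (it looks that way in the examples you provided). If you need a manual annotate-and-rewarp tool, just pm me! ^^ The results of this step are below (the examples I use here are arguably harder than the one you provided, and may not exactly match your case):
From this we can immediately see the following problems:
The lighting condition is not even. This means all the simple binarization methods won't work. I tried a lot of the solutions available in OpenCV, as well as combinations of them, and none of them worked!
There is a lot of background noise. In my case, I needed to remove the grid of the paper, and also the ink from the other side of the paper that is visible through the thin sheet.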
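Step 1: Gamma correction
The reasoning behind this step is to balance out the contrast of the whole image (since your image can be slightly overexposed or underexposed depending on the lighting condition).
This may seem like an unnecessary step at first, but its importance should not be underestimated: in a sense, it normalizes the images to similar exposure distributions, so that you can choose meaningful hyperparameters later (e.g. the DELTA parameter in the next section, the noise filtering parameters, the parameters for the morphological operations, etc.)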
import cv2
import numpy as np

# Somehow I found the value of `gamma=1.2` to be the best in my case
def adjust_gamma(image, gamma=1.2):
    # build a lookup table mapping the pixel values [0, 255] to
    # their adjusted gamma values
    invGamma = 1.0 / gamma
    table = np.array([((i / 255.0) ** invGamma) * 255
                      for i in np.arange(0, 256)]).astype("uint8")

    # apply gamma correction using the lookup table
    return cv2.LUT(image, table)
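Here are the results of the gamma adjustment:
You can see that it is a bit more... "balanced" now. Without this step, all the parameters that you choose by hand in later steps will become less robust!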
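To see what the gamma lookup table does to individual intensities, here is a small NumPy-only sketch. It builds the same kind of table as adjust_gamma and applies it by plain array indexing, which for a single-channel uint8 image should give the same values as cv2.LUT; the name `gamma_table` and the sample pixel values are illustrative only.

```python
import numpy as np

def gamma_table(gamma=1.2):
    """Build a 256-entry lookup table for out = 255 * (in / 255) ** (1 / gamma)."""
    inv_gamma = 1.0 / gamma
    levels = np.arange(256) / 255.0
    return (levels ** inv_gamma * 255).astype(np.uint8)

table = gamma_table(1.2)

# Indexing the table with the image applies the mapping to every pixel.
image = np.array([[0, 64], [128, 255]], dtype=np.uint8)
adjusted = table[image]
print(adjusted)
```

With gamma above 1 the end points stay fixed while mid-tones are lifted, which is why an underexposed page looks more even afterwards.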
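Step 2: Adaptive binarization to detect the text blobs
In this step, we adaptively binarize out the text blobs. I will add more comments later, but the basic idea is the following:
We divide the image into blocks of size BLOCK_SIZE. The trick is to choose a size large enough that each block still contains a good chunk of both text and background (i.e. larger than any symbol you have), but small enough not to suffer from variations in the lighting condition (i.e. "large, but still local").
Inside each block, we do locally-adaptive binarization: we look at the median value and hypothesize that it is the background (because we chose BLOCK_SIZE large enough for most of the block to be background). Then we define DELTA, which is basically just a threshold for "how far from the median do we still consider a pixel to be background?".
So, the function process_image gets the job done. Moreover, you can modify the preprocess and postprocess functions to fit your needs (however, as you can see from the example above, the algorithm is pretty robust, i.e. it works quite well out of the box without much parameter tuning).
The code in this part assumes the foreground is darker than the background (i.e. ink on paper). But you can easily change that by tweaking the preprocess function: just return image instead of 255 - image.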
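The median-plus-DELTA rule can be tried on a toy block before reading the full implementation. The sketch below is a simplified illustration, not the author's function: it works on an already inverted block (ink bright, paper dark), skips the dilation step, and uses a made-up helper name `median_background_mask`.

```python
import numpy as np

DELTA = 25  # same default as in the answer's code

def median_background_mask(block, delta=DELTA):
    """Mark pixels within `delta` above the block median as background (255)."""
    med = np.median(block)
    # Subtract in float so uint8 values cannot wrap around.
    diff = block.astype(np.float64) - med
    return np.where(diff < delta, 255, 0).astype(np.uint8)

# Inverted toy block: paper is around 30, ink strokes are around 200.
block = np.array([[30, 32, 28, 200],
                  [31, 29, 210, 33],
                  [30, 45, 30, 31]], dtype=np.uint8)
mask = median_background_mask(block)
print(mask)
```

The pixel with value 45 stays background because it is only 14 above the median of 31, while the two ink pixels are marked as foreground (0).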
# These are probably the only important parameters in the
# whole pipeline (steps 0 through 3).
BLOCK_SIZE = 40
DELTA = 25

# Do the necessary noise cleaning and other stuff.
# I just do a simple blurring here but you can optionally
# add more stuff.
def preprocess(image):
    image = cv2.medianBlur(image, 3)
    return 255 - image

# Again, this step is fully optional and you can even keep
# the body empty. I just did some opening. The algorithm is
# pretty robust, so this stuff won't affect much.
def postprocess(image):
    kernel = np.ones((3,3), np.uint8)
    image = cv2.morphologyEx(image, cv2.MORPH_OPEN, kernel)
    return image

# Just a helper function that generates box coordinates
def get_block_index(image_shape, yx, block_size):
    y = np.arange(max(0, yx[0]-block_size), min(image_shape[0], yx[0]+block_size))
    x = np.arange(max(0, yx[1]-block_size), min(image_shape[1], yx[1]+block_size))
    # Return a tuple so that NumPy treats it as a (rows, cols) index pair;
    # a plain list of index arrays is not accepted by recent NumPy versions.
    return tuple(np.meshgrid(y, x))

# Here is where the trick begins. We perform binarization from the
# median value locally (the img_in is actually a slice of the image).
# Here, the following assumptions are held:
#   1. The majority of pixels in the slice is background
#   2. The median value of the intensity histogram probably
#      belongs to the background. We allow a soft margin DELTA
#      to account for any irregularities.
#   3. We need to keep everything other than the background.
#
# We also do simple morphological operations here. It was just
# something that I empirically found to be "useful", but I assume
# this is pretty robust across different datasets.
def adaptive_median_threshold(img_in):
    med = np.median(img_in)
    img_out = np.zeros_like(img_in)
    img_out[img_in - med < DELTA] = 255
    kernel = np.ones((3,3), np.uint8)
    img_out = 255 - cv2.dilate(255 - img_out, kernel, iterations=2)
    return img_out

# This function just divides the image into local regions (blocks),
# and applies the `adaptive_median_threshold(...)` function to each
# of the regions.
def block_image_process(image, block_size):
    out_image = np.zeros_like(image)
    for row in range(0, image.shape[0], block_size):
        for col in range(0, image.shape[1], block_size):
            idx = (row, col)
            block_idx = get_block_index(image.shape, idx, block_size)
            out_image[block_idx] = adaptive_median_threshold(image[block_idx])
    return out_image

# This function invokes the whole pipeline of Step 2.
def process_image(img):
    image_in = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    image_in = preprocess(image_in)
    image_out = block_image_process(image_in, BLOCK_SIZE)
    image_out = postprocess(image_out)
    return image_out
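The results are nice blobs like these, closely following the ink trace:
Step 3: The "soft" part of binarization
Having blobs that cover the symbols and a little bit more, we can finally do the whitening procedure.
If we look more closely at photos of paper sheets with text (especially those with handwriting), the transition from "background" (white paper) to "foreground" (dark ink) is not sharp, but very gradual. Other binarization-based answers in this thread propose simple thresholding (even if it is locally adaptive, it is still a threshold), which works okay for printed text but produces not-so-pretty results with handwriting.
So, the motivation of this section is that we want to preserve that effect of gradual transition from black to white, just as in natural photos of paper sheets with natural ink. The final purpose is to make it printable.
The main idea is simple: the more a pixel value (after the thresholding above) differs from the local minimum, the more likely it belongs to the background. We can express this using a family of sigmoid functions, re-scaled to the range of the local block (so that the function is adaptively scaled throughout the image).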
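To get a feel for the sigmoid used for the soft blending, the sketch below evaluates it on a few normalized intensities. The function body follows the answer's definition; the sample values `orig=0.3` and `rad=0.2` are chosen only for illustration (in the answer, the center is derived per block from an Otsu threshold).

```python
import numpy as np

def sigmoid(x, orig, rad):
    """Logistic curve centered at `orig`; `rad` controls how wide the transition is."""
    k = np.exp((x - orig) * 5 / rad)
    return k / (k + 1.0)

# Normalized intensities from 0 (local minimum, ink) to 1 (local maximum).
x = np.linspace(0.0, 1.0, 6)
weights = sigmoid(x, orig=0.3, rad=0.2)
for xi, wi in zip(x, weights):
    print(f"{xi:.1f} -> {wi:.4f}")
```

Values well below the center map close to 0 (kept dark), values well above it map close to 1 (pushed to white), and only a narrow band around the center gets intermediate gray levels. That band is what preserves the gradual edge of a pen stroke.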
# This is the function used for composing
def sigmoid(x, orig, rad):
    k = np.exp((x - orig) * 5 / rad)
    return k / (k + 1.)

# Here, we combine the local blocks. A bit lengthy, so please
# follow the local comments.
def combine_block(img_in, mask):
    # First, we pre-fill the masked region of img_out to white
    # (i.e. background). The mask is retrieved from previous section.
    img_out = np.zeros_like(img_in)
    img_out[mask == 255] = 255
    fimg_in = img_in.astype(np.float32)

    # Then, we store the foreground (letters written with ink)
    # in the `idx` array. If there are none (i.e. just background),
    # we move on to the next block.
    idx = np.where(mask == 0)
    if idx[0].shape[0] == 0:
        img_out[idx] = img_in[idx]
        return img_out

    # We find the intensity range of our pixels in this local part
    # and clip the image block to that range, locally.
    lo = fimg_in[idx].min()
    hi = fimg_in[idx].max()
    v = fimg_in[idx] - lo
    r = hi - lo

    # Now we use good old OTSU binarization to get a rough estimation
    # of foreground and background regions.
    img_in_idx = img_in[idx]
    ret3, th3 = cv2.threshold(img_in_idx, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)

    # Then we normalize the stuff and apply sigmoid to gradually
    # combine the stuff.
    bound_value = np.min(img_in_idx[th3[:, 0] == 255])
    bound_value = (bound_value - lo) / (r + 1e-5)
    f = (v / (r + 1e-5))
    f = sigmoid(f, bound_value + 0.05, 0.2)

    # Finally, we re-normalize the result to the range [0..255]
    img_out[idx] = (255. * f).astype(np.uint8)
    return img_out

# We do the combination routine on local blocks, so that the scaling
# parameters of Sigmoid function can be adjusted to local setting
def combine_block_image_process(image, mask, block_size):
    out_image = np.zeros_like(image)
    for row in range(0, image.shape[0], block_size):
        for col in range(0, image.shape[1], block_size):
            idx = (row, col)
            block_idx = get_block_index(image.shape, idx, block_size)
            out_image[block_idx] = combine_block(
                image[block_idx], mask[block_idx])
    return out_image

# Postprocessing (should be robust even without it, but I recommend
# you to play around a bit and find what works best for your data).
# I just left it blank.
def combine_postprocess(image):
    return image

# The main function of this section. Executes the whole pipeline.
def combine_process(img, mask):
    image_in = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    image_out = combine_block_image_process(image_in, mask, 20)
    image_out = combine_postprocess(image_out)
    return image_out
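Some parts are commented out because they are optional. The combine_process function takes the mask from the previous step and executes the whole composition pipeline. You can try to toy with them for your specific data (images). The results are neat:
I will probably add more comments and explanations to the code in this answer, and will upload the whole thing (together with the cropping and warping code) to Github.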

User
Answer 3 · edited 2024-09-26 18:16:17

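This method should work well for your application. First find a threshold value in the intensity histogram that separates the distribution modes well, then rescale the intensity using that value.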

from skimage.filters import threshold_yen
from skimage.exposure import rescale_intensity
from skimage.io import imread, imsave

img = imread('mY7ep.jpg')

yen_threshold = threshold_yen(img)
bright = rescale_intensity(img, (0, yen_threshold), (0, 255))

imsave('out.jpg', bright)

I'm using Yen's method here; you can learn more about this method on this page.
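The rescaling step can be illustrated without scikit-image. The sketch below only mimics the linear stretch that rescale_intensity applies for an input range of (0, threshold) and an output range of (0, 255); it does not compute Yen's threshold, so the value 100 is an arbitrary stand-in, and the helper name `rescale_to_threshold` is hypothetical. Rounding and output dtype may differ from the library.

```python
import numpy as np

def rescale_to_threshold(img, threshold):
    """Stretch [0, threshold] linearly to [0, 255]; brighter pixels saturate."""
    clipped = np.clip(img.astype(np.float64), 0, threshold)
    scaled = clipped / threshold * 255.0
    return np.rint(scaled).astype(np.uint8)

pixels = np.array([0, 20, 100, 200], dtype=np.uint8)
bright = rescale_to_threshold(pixels, threshold=100)
print(bright)
```

Everything at or above the threshold (ideally the paper) becomes pure white, while darker values (the ink) keep their relative spacing.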
