Problem recognizing text in an image using the pytesseract Python module

1 answer

User

Answer 1 · posted 2024-09-30 20:23:31

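Let's look at what your code is doing.

We need to see which parts of the text are localized and detected.
To understand the code's behavior, we will use the image_to_data function.
image_to_data shows which parts of the image were detected.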

from PIL import Image, ImageDraw
import pytesseract

# Open the image and convert it to gray-scale
finalImg = Image.open('hP5Pt.jpg').convert('L')

# Initialize ImageDraw for drawing the detected rectangles on the image
finalImgDraw = ImageDraw.Draw(finalImg)

# OCR detection
d = pytesseract.image_to_data(finalImg, output_type=pytesseract.Output.DICT)

# Number of detected regions
n_boxes = len(d['level'])

# For each detected part
for i in range(n_boxes):

    # Get the localized region
    (x, y, w, h) = (d['left'][i], d['top'][i], d['width'][i], d['height'][i])

    # Pillow expects two corners, so convert width/height to the bottom-right corner
    shape = [(x, y), (x + w, y + h)]

    # Draw the region
    finalImgDraw.rectangle(shape, outline="red")

# Display the image with the detected regions
finalImg.show()

# OCR "psm 6: Assume a single uniform block of text."
txt = pytesseract.image_to_string(finalImg, config="--psm 6")

# Result
print(txt)

Result:

```
i
I
```

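So the displayed image shows that nothing was detected. The code does not work, and the output is not the desired result.
There could be various reasons.
Here are some facts about the input image:
- It is a binary image
- It contains a big rectangle artifact
- The text is somewhat exaggerated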
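
Besides looking at the drawn boxes, it can help to list which image_to_data entries actually contain text. The following is a minimal sketch, not part of the original answer: `word_boxes` and `min_conf` are hypothetical names, and the sample dictionary is hand-written to mimic the layout of `pytesseract.Output.DICT` rather than taken from a real OCR run.

```python
def word_boxes(data, min_conf=0.0):
    """Return (text, conf, (x0, y0, x1, y1)) for entries that contain text.

    `data` is assumed to have the layout of the dictionary returned by
    pytesseract.image_to_data(..., output_type=pytesseract.Output.DICT).
    """
    results = []
    for i, text in enumerate(data["text"]):
        # conf may arrive as int, float, or str depending on the version
        conf = float(data["conf"][i])
        if not text.strip() or conf < min_conf:
            continue
        x, y = data["left"][i], data["top"][i]
        w, h = data["width"][i], data["height"][i]
        # Convert (left, top, width, height) to two corners
        results.append((text, conf, (x, y, x + w, y + h)))
    return results


# Hand-written sample in the same layout (not real OCR output)
sample = {
    "text": ["", "  ", "$1,582"],
    "conf": ["-1", "-1", "91"],
    "left": [0, 0, 12],
    "top": [0, 0, 8],
    "width": [200, 200, 60],
    "height": [50, 50, 20],
}
print(word_boxes(sample))  # [('$1,582', 91.0, (12, 8, 72, 28))]
```

With a real image, `d` from the code above would be passed in place of `sample`. An empty list would indicate that only structural regions (page, block, line) were found and no words.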

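Without testing, we can't know whether the image needs pre-processing.
We are sure that the big black rectangle is an artifact, and we need to remove it. One solution is to select only part of the image.
To select part of the image, we crop it (`crop` in the code below) and use some trial and error to find the ROI.
- If we split the image in two by height, we don't want the half that contains the other artifact.
- At first glance, we want (0 -> height/2). If you experiment with these values, you can see that the exact text location is between (height/6 -> height/4).
The result will be: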
```
$1,582
```
Code:

from PIL import Image, ImageDraw
import pytesseract

# Open the image and convert it to gray-scale
finalImg = Image.open('hP5Pt.jpg').convert('L')

# Get width and height of the image
w, h = finalImg.size

# Keep only the band that contains the desired text
finalImg = finalImg.crop((0, int(h/6), w, int(h/4)))

# Initialize ImageDraw for drawing the detected rectangles on the image
finalImgDraw = ImageDraw.Draw(finalImg)

# OCR detection
d = pytesseract.image_to_data(finalImg, output_type=pytesseract.Output.DICT)

# Number of detected regions
n_boxes = len(d['level'])

# For each detected part
for i in range(n_boxes):

    # Get the localized region
    (x, y, w, h) = (d['left'][i], d['top'][i], d['width'][i], d['height'][i])

    # Pillow expects two corners, so convert width/height to the bottom-right corner
    shape = [(x, y), (x + w, y + h)]

    # Draw the region
    finalImgDraw.rectangle(shape, outline="red")

# Display the cropped image with the detected regions
finalImg.show()

# OCR "psm 6: Assume a single uniform block of text."
txt = pytesseract.image_to_string(finalImg, config="--psm 6")

# Result
print(txt)
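
Since the ROI was found by trial and error, a small helper for building candidate crop boxes may make the experiments easier to repeat. This is only a sketch under assumptions: `band_box` is a hypothetical name, the 600x1200 image size is made up for illustration, and `Fraction` is used so that values such as 1/6 are not affected by floating-point rounding.

```python
from fractions import Fraction


def band_box(width, height, top_frac, bottom_frac):
    """Return a Pillow-style crop box (left, upper, right, lower)
    for a full-width horizontal band of the image."""
    if not 0 <= top_frac < bottom_frac <= 1:
        raise ValueError("expected 0 <= top_frac < bottom_frac <= 1")
    return (0, int(height * top_frac), width, int(height * bottom_frac))


# Candidate bands: the upper half first, then the narrower band
# between height/6 and height/4 (image size here is hypothetical)
for top, bottom in [(Fraction(0), Fraction(1, 2)), (Fraction(1, 6), Fraction(1, 4))]:
    print(band_box(600, 1200, top, bottom))
```

Each returned tuple can be passed to `Image.crop`, for example `finalImg.crop(band_box(w, h, Fraction(1, 6), Fraction(1, 4)))`, before running OCR on the result.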

If you can't reproduce my result, check the Tesseract version that pytesseract is using:

print(pytesseract.get_tesseract_version())

For me, the result is 4.1.1
