Unable to read text from Image using pytesseract.image_to_string

Question

The problem here is that I need to remove the lines and write code to recognize the characters. The solutions I have seen so far deal with characters drawn as solid shapes, but in this image the characters are drawn with a double border.

Most captcha generators are designed so that traditional tools like Tesseract can't read their output. (The authors run publicly available tools such as Tesseract on a sample batch of generated captchas and publish that generation mode only if the tools fail in most cases.) So this is expected. You can try other OCR tools; the strongest, I think, is Google Cloud's Vision API (e.g. Google Lens). I tested hand-modified versions of the image and found that, as you mentioned, we should remove the lines and fill the characters using OpenCV's contours. – Parano, May 24 '21 at 17:22

HansHirse · Accepted Answer · 2021-05-25T08:34:50.640

For this specific captcha, there's quite a simple solution. However, there's no guarantee that this approach will work on other, even very similar captchas – both because of the "nature" of captchas already mentioned in the comments, and because that's generally the case for image-processing tasks with so little input data provided.

1. Read the image as grayscale.
2. Threshold the image at a nearly white cutoff.
3. Flood fill the "background" with black.
4. Run pytesseract with the --psm 6 option.

That'd be the whole code:

import cv2
import pytesseract

# Read image as grayscale
img = cv2.imread('FuZEJ.png', cv2.IMREAD_GRAYSCALE)

# Threshold at nearly white cutoff
thr = cv2.threshold(img, 224, 255, cv2.THRESH_BINARY)[1]

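# Flood fill the "background" with black, starting at the top-left pixel;
# floodFill returns a tuple whose element [1] is the filled image
ff = cv2.floodFill(thr, None, (0, 0), 0)[1]

# OCR using pytesseract; --psm 6 assumes a single uniform block of text
text = pytesseract.image_to_string(ff, config='--psm 6').replace('\n', '').replace('\f', '')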
print(text)
# xwphs

Caveat: I use a special version of Tesseract from the Mannheim University Library.
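
Since the result may depend on the Tesseract build, it can help to check which one is installed before comparing outputs. A minimal sketch, assuming the executable is called `tesseract` and is on the PATH; the helper name `tesseract_version` is made up here, and pytesseract also offers `pytesseract.get_tesseract_version()` for the same purpose.

```python
import shutil
import subprocess

def tesseract_version(binary="tesseract"):
    """Return the first line of `<binary> --version`, or None if not found."""
    path = shutil.which(binary)
    if path is None:
        return None
    result = subprocess.run([path, "--version"], capture_output=True, text=True)
    # Some builds print the version to stderr instead of stdout
    output = (result.stdout or result.stderr).strip()
    return output.splitlines()[0] if output else None

print(tesseract_version())
```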

----------------------------------------
System information
----------------------------------------
Platform:      Windows-10-10.0.16299-SP0
Python:        3.9.1
PyCharm:       2021.1.1
OpenCV:        4.5.1
pytesseract:   5.0.0-alpha.20201127
----------------------------------------

score 2 · Answer 2 · answered May 26 '21 at 19:33

I would try a mask:

import cv2
import numpy as np

def process(img): # Build an edge map of the characters
    img_gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    # Set near-white pixels (above 224) to zero; darker pixels keep their value
    _, img_gray = cv2.threshold(img_gray, 224, 255, cv2.THRESH_TOZERO_INV)
    img_blur = cv2.GaussianBlur(img_gray, (7, 7), 6)
    img_canny = cv2.Canny(img_blur, 0, 100)
    # Widen the edges horizontally with a 1x5 kernel
    return cv2.dilate(img_canny, np.ones((1, 5)), iterations=1)

def get_mask(img): # To generate the mask
    mask = np.zeros(img.shape[:2], 'uint8')
    contours, _ = cv2.findContours(process(img), cv2.RETR_TREE, cv2.CHAIN_APPROX_NONE)
    for cnt in contours:
        cv2.drawContours(mask, [cnt], -1, 255, -1)
    return mask

def crop(img, mask): # Keep img where the mask is set, white everywhere else
    bg = np.full(img.shape, 255, 'uint8')
    fg = cv2.bitwise_or(img, img, mask=mask)
    fg_back_inv = cv2.bitwise_or(bg, bg, mask=cv2.bitwise_not(mask))
    return cv2.bitwise_or(fg, fg_back_inv)

img = cv2.imread("image.png")
img = cv2.pyrUp(cv2.pyrUp(img)) # To enlarge image by 4x
cv2.imshow("Masked Image", crop(img, get_mask(img)))
cv2.waitKey(0)
