Preprocessing image for Text OCR in Python

Question

I'm currently working on a program that converts images to text using the pytesseract library in Python. I'm having trouble getting consistent results: for some images the output is correct, but for others it is incorrect.

For example,

The output for the above image is correct, i.e.,

U3DS

But in the case of the image below,

the output is incorrect. It shows:

ss

So how can I preprocess the image more effectively so that the OCR engine can recognize the text?

The code of this program:

import pytesseract as pt
import cv2
from PIL import Image
import numpy as np

pt.pytesseract.tesseract_cmd = r"C:\Users\user\AppData\Local\Programs\Tesseract-OCR\tesseract.exe"


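img = cv2.imread("dd.png")  # expected text: U3DS

# Dividing by 0.75 enlarges the image by about 1.33x
img = cv2.resize(img, (int(img.shape[1] / .75), int(img.shape[0] / .75)))

gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# Edge-preserving smoothing before thresholding
bl = cv2.bilateralFilter(gray, 9, 5, 5)

# Otsu's method picks the binarization threshold automatically
th = cv2.threshold(bl, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)[1]

# 3x3 sharpening kernel applied to the binarized image
kernel = np.array([[0, -1, 0], [-1, 5, -1], [0, -1, 0]])
im = cv2.filter2D(th, -1, kernel)

cv2.imshow('', im)
cv2.waitKey(0)
cv2.destroyAllWindows()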

# Note: --psm 10 tells Tesseract to treat the image as a single character
test = pt.image_to_string(im, config="--psm 10")
print(test)

It's hard to use computer vision on images *specifically* designed to be hard for computers to read. You may be able to find the bounding rect then matrix correct the distortion a little, though it won't work in all situations. — mousetail, Jun 09 '22 at 14:23
I’m voting to close this question because this is yet another request to solve CAPTCHAs with OpenCV. — bfris, Jun 09 '22 at 14:41
Here is a small sample of all of the "please help me defeat Captcha" questions over the years here: [this one](https://stackoverflow.com/q/13664161/9705687), [this one](https://stackoverflow.com/q/63091310/9705687), [this one](https://stackoverflow.com/q/58872451/9705687), [this one](https://stackoverflow.com/q/72493869/9705687), [this one](https://stackoverflow.com/q/54387001/9705687). Or [let me Google that for you](https://stackoverflow.com/search?q=python+captcha). — bfris, Jun 09 '22 at 14:48
@bfris Thank you for the links to those questions. Before asking here, I had already searched on Google, but I couldn't find a satisfactory answer. That's why I asked the question here. — simransharma3, Jun 10 '22 at 05:38
