I'm working on a program that converts images to text using the pytesseract library in Python, and I'm having trouble getting consistent results. For some images the output is correct, but for others it is wrong.
For example,
The output for the above image is correct, i.e.,
U3DS
But in the case of the image below,
the output is incorrect. It shows:
ss
How can I preprocess the image more effectively so that the OCR engine can recognize the text?
The code for this program:
import pytesseract as pt
import cv2
from PIL import Image
import numpy as np
pt.pytesseract.tesseract_cmd = r"C:\Users\user\AppData\Local\Programs\Tesseract-OCR\tesseract.exe"
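img = cv2.imread("dd.png") #U3DS
# Dividing by 0.75 enlarges the image to about 1.33x its original size
img = cv2.resize(img,(int(img.shape[1]/.75),int(img.shape[0]/.75)))
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# Edge-preserving smoothing before thresholding
bl = cv2.bilateralFilter(gray,9,5,5)
# Otsu chooses the threshold automatically; [1] selects the thresholded image
th = cv2.threshold(bl, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU )[1]
# 3x3 sharpening kernel applied to the binarized image
kernel = np.array([[0,-1,0],[-1,5,-1],[0,-1,0]])
im = cv2.filter2D(th, -1, kernel)
# Note: this 5x5 kernel is defined but never used below
kernel = np.ones((5,5), np.uint8)
cv2.imshow('',im)
cv2.waitKey(0)
cv2.destroyAllWindows()  # the parentheses are required to actually call it
# --psm 10 tells Tesseract to treat the image as a single character
test = pt.image_to_string(im,config = "--psm 10")
print(test)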