I am trying to apply OCR using OpenCV and Python-tesseract to convert the following image to text: Original image.
However, Tesseract has not managed to read the image correctly so far. It returns "uleswylly Bie7 Srp a7" instead.
I have taken the following steps to pre-process the image before I feed it to tesseract:
- First I upscale the image:
import cv2
import numpy as np
import pytesseract

# Image scaling
def set_image_dpi(img):
    # Get current dimensions of the image
    height, width = img.shape[:2]
    # Define scale factor
    scale_factor = 6
    # Calculate new dimensions
    new_height = int(height * scale_factor)
    new_width = int(width * scale_factor)
    # Resize image (cv2.resize expects the size as (width, height))
    return cv2.resize(img, (new_width, new_height))
Image result: result1.png
- Normalize the image:
# Normalization
norm_img = np.zeros((img.shape[0], img.shape[1]))
img = cv2.normalize(img, norm_img, 0, 255, cv2.NORM_MINMAX)
Image result: result2.png
- Then I remove some noise:
# Remove noise
def remove_noise(img):
    return cv2.fastNlMeansDenoisingColored(img, None, 10, 10, 7, 15)
Image result: result3.png
- Get the grayscale image:
# Get grayscale
def get_grayscale(img):
    return cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
Image result: result4.png
- Apply thresholding:
# Thresholding
def thresholding(img):
    # With THRESH_OTSU set, OpenCV picks the threshold itself and ignores the 150
    return cv2.threshold(img, 150, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)[1]
Image result: result5.png
- Invert the image color:
# Invert the image
def invert(img):
    return cv2.bitwise_not(img)
Image result: result6.png
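Whether this inversion helps depends on the polarity of the thresholded image; as far as I know, the Tesseract documentation recommends dark text on a light background for version 4.x. A rough sketch for checking which polarity an image has is below. `describe_polarity` is a hypothetical helper, and it assumes a binarized 8-bit image in which the text covers less area than the background.

```python
import numpy as np


def describe_polarity(binary_img):
    """Report the polarity of a binarized image (values 0 and 255 only).

    Hypothetical helper: assumes the background covers more area than the text.
    """
    # Fraction of pixels that are white
    white_fraction = np.count_nonzero(binary_img == 255) / binary_img.size
    if white_fraction > 0.5:
        return "dark text on light background"
    return "light text on dark background"


# Synthetic example: white page with a small black bar standing in for text.
page = np.full((10, 10), 255, dtype=np.uint8)
page[4:6, 2:8] = 0
print(describe_polarity(page))
print(describe_polarity(255 - page))
```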
- Finally I pass the image to pytesseract:
# Pass preprocessed image to pytesseract
text = pytesseract.image_to_string(img)
print("Text found: " + text)
pytesseract output: "uleswylly Bie7 Srp a7"
I would like to improve my pre-processing so that pytesseract can actually read the image. Any help would be greatly appreciated!
Thanks in advance,
Steenert