I am trying to extract text from a meter display using Python 3, OpenCV and pytesseract. With a lot of help from SO I have had some success with the code below, and I can print the text from a basic, 'tidy' image file. However, when I try to extract text from the attached dot matrix image, the script is unable to pick out any text at all.
Is there a limitation when extracting this kind of dot matrix text?
Here's what I am working with:
import cv2
import numpy as np
import pytesseract
from PIL import Image
from cv2 import boundingRect, countNonZero, cvtColor, drawContours, findContours, getStructuringElement, \
    imread, morphologyEx, pyrDown, rectangle, threshold

img = imread('test.JPG')
# down sample and use it for processing
adjusted = pyrDown(img)
# gray-scale image
img_gray = cvtColor(adjusted, cv2.COLOR_BGR2GRAY)
# morph gradient
morph_kernel = getStructuringElement(cv2.MORPH_ELLIPSE, (3, 3))
grad = morphologyEx(img_gray, cv2.MORPH_GRADIENT, morph_kernel)
# change to binary and morph
_, bw = threshold(src=grad, thresh=0, maxval=255, type=cv2.THRESH_BINARY+cv2.THRESH_OTSU)
morph_kernel = getStructuringElement(cv2.MORPH_RECT, (9, 1))
connected = morphologyEx(bw, cv2.MORPH_CLOSE, morph_kernel)
applyMask = np.zeros(bw.shape, np.uint8)
# get contours (OpenCV 3 returns three values, OpenCV 4 returns two, so take the last two)
contour, hierarchy = findContours(connected, cv2.RETR_CCOMP, cv2.CHAIN_APPROX_SIMPLE)[-2:]
# filter contours
for index in range(0, len(hierarchy[0])):
    rect = x, y, rectangle_width, rectangle_height = boundingRect(contour[index])
    # draw contour
    mask = drawContours(applyMask, contour, index, (255, 255, 255), cv2.FILLED)
    # find non-zero pixels ratio inside this contour's bounding box only
    roi = applyMask[y:y+rectangle_height, x:x+rectangle_width]
    r = float(countNonZero(roi)) / (rectangle_width * rectangle_height)
    if r > 0.5 and rectangle_height > 8 and rectangle_width > 8:
        rec_img = rectangle(adjusted, (x, y+rectangle_height), (x+rectangle_width, y), (0, 255, 0), 3)
        text = pytesseract.image_to_string(Image.fromarray(rec_img))
        print(text)
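One preprocessing idea that may be relevant here is to merge the separate dots of each character into solid strokes with a morphological closing before running OCR. Below is a minimal NumPy-only sketch of that idea on a synthetic column of dots, so it can be run without the meter image. The helper names (`dilate`, `erode`, `close_dots`), the kernel size and the test pattern are all assumptions made for illustration, and I have not verified that this makes the real image readable.

```python
import numpy as np

def dilate(img, k):
    """Binary dilation with a k x k square kernel (k odd)."""
    pad = k // 2
    padded = np.pad(img, pad, mode="constant", constant_values=0)
    out = np.zeros_like(img)
    h, w = img.shape
    # OR together every shifted copy covered by the kernel
    for dy in range(k):
        for dx in range(k):
            out |= padded[dy:dy + h, dx:dx + w]
    return out

def erode(img, k):
    """Binary erosion with a k x k square kernel (k odd)."""
    pad = k // 2
    padded = np.pad(img, pad, mode="constant", constant_values=1)
    out = np.ones_like(img)
    h, w = img.shape
    # AND together every shifted copy covered by the kernel
    for dy in range(k):
        for dx in range(k):
            out &= padded[dy:dy + h, dx:dx + w]
    return out

def close_dots(img, k=3):
    """Morphological closing: dilation followed by erosion."""
    return erode(dilate(img, k), k)

# Hypothetical test pattern: one vertical stroke made of 5 dots,
# with a single blank pixel between neighbouring dots.
dots = np.zeros((13, 7), dtype=np.uint8)
dots[2:11:2, 3] = 1

closed = close_dots(dots, k=3)
print(dots[:, 3])    # the stroke before closing (dots with gaps)
print(closed[:, 3])  # the stroke after closing (gaps filled)
```

In the script above, the corresponding step would be a `morphologyEx(bw, cv2.MORPH_CLOSE, kernel)` call with a kernel at least as large as the gap between dots in both directions; the existing (9, 1) kernel only bridges horizontal gaps.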
And here's the image I am trying to extract from: