Image recognition difficulties with OCR - reading numbers from a picture

Question

I am trying to develop a Python script that reads numbers from pictures; to be more exact, I am trying to read the gas consumption. The numbers are always in the same location. There are two "types" of pictures, bright and dark. (I take a photo every 10 minutes, so I have plenty of examples if needed.)

The result I want is 8 digits, e.g. 10974748 (from the dark pic).
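Because the photos fall into two groups, one possible first step is to decide per photo whether it is bright or dark, so that each group can get its own preprocessing. The sketch below does this from the mean gray level with NumPy. The function name `classify_brightness` and the cutoff of 100 are assumptions for illustration only and would need tuning on real photos.

```python
import numpy as np


def classify_brightness(gray, cutoff=100.0):
    """Label a grayscale image as "dark" or "bright" by its mean intensity.

    `cutoff` is a hypothetical value on the 0-255 scale, not taken from the
    question; pick it by looking at the mean of a few known photos.
    """
    return "dark" if float(np.mean(gray)) < cutoff else "bright"


# Synthetic stand-ins for the two kinds of photo.
dark_photo = np.full((75, 580), 40, dtype=np.uint8)
bright_photo = np.full((75, 580), 180, dtype=np.uint8)
print(classify_brightness(dark_photo), classify_brightness(bright_photo))
```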

I am mainly using pytesseract and OpenCV (cv2).

So far the best solution seems to be to crop the needed part of the picture first and then call pytesseract.image_to_string() with config = --psm 7. Unfortunately it is not reliable: when photos were taken while there was no consumption, so that they show the same number, it does not return the same digits for them.

import cv2
import numpy as np
import os
import pytesseract

pytesseract.pytesseract.tesseract_cmd = r"C:\Program Files\Tesseract-OCR\tesseract"
directory = r"C:\Users\user\Desktop\test_pcs\test"

for image in os.listdir(directory):
    original_image_path = os.path.join(directory, image)
    original_image = cv2.imread(original_image_path)
    if original_image is None:
        # cv2.imread returns None for files it cannot decode; skip them.
        continue
    # The digits are always in the same place, so crop a fixed region.
    x_start, y_start = 1110, 445
    x_end, y_end = 1690, 520
    cropped_image = original_image[y_start:y_end, x_start:x_end]
    # --psm 7 treats the crop as a single line of text.
    text = pytesseract.image_to_string(cropped_image, config="--psm 7 outputbase digits")
    cv2.imshow("Cropped", cropped_image)
    cv2.waitKey(0)
    print(text.strip() + "    " + original_image_path)
    
cv2.destroyAllWindows()
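Since the expected result is always 8 digits, the string returned by image_to_string in the loop above can be checked before it is accepted. This is a minimal sketch of such a check, not part of the original script; the helper name `clean_reading` is hypothetical. It removes all whitespace and rejects anything that is not exactly 8 digits.

```python
import re


def clean_reading(ocr_text, n_digits=8):
    """Return the reading as a string of exactly `n_digits` digits, or None."""
    # OCR output often ends with a newline or form feed and may contain
    # spaces between digits; drop all whitespace before validating.
    compact = re.sub(r"\s+", "", ocr_text)
    return compact if re.fullmatch(rf"\d{{{n_digits}}}", compact) else None


print(clean_reading("10974748\n\x0c"))
print(clean_reading("1097474"))
```

Readings that come back as None could be logged together with the file name, which makes it easier to see which photos fail.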

After that I tried thresholding, but sadly I get worse results than with the plain image_to_string call. Adaptive thresholding produces an output image that does not look too bad, but Tesseract cannot read it.

import cv2 as cv
import numpy as np
from matplotlib import pyplot as plt
import pytesseract

pytesseract.pytesseract.tesseract_cmd = r"C:\Program Files\Tesseract-OCR\tesseract"

# Load as grayscale and smooth out noise before thresholding.
img = cv.imread(r"C:\Users\user\Desktop\test_pcs\new2\2022-10-30_14-49-30.jpg", cv.IMREAD_GRAYSCALE)
img = cv.medianBlur(img, 5)
# Global threshold (th1 is not used below).
ret, th1 = cv.threshold(img, 127, 255, cv.THRESH_BINARY)
# Adaptive mean thresholding
th2 = cv.adaptiveThreshold(img, 255, cv.ADAPTIVE_THRESH_MEAN_C,
                           cv.THRESH_BINARY, 11, 2)
# Adaptive Gaussian thresholding
th3 = cv.adaptiveThreshold(img, 255, cv.ADAPTIVE_THRESH_GAUSSIAN_C,
                           cv.THRESH_BINARY, 11, 2)

# Show the blurred image and both adaptive results side by side.
images = [img, th2, th3]
for i in range(3):
    plt.subplot(2, 2, i + 1)
    plt.imshow(images[i], 'gray')

plt.show()

# Crop the digit region from the adaptive mean result.
x_start, y_start = 1110, 450
x_end, y_end = 1690, 520
cropped_image = th2[y_start:y_end, x_start:x_end]

plt.imshow(cropped_image, 'gray')
plt.show()

text = pytesseract.image_to_string(cropped_image, config="--psm 7 outputbase digits")

print("digits: " + text)

I also tried to read the digits character by character but it failed as well.
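For reference, one simple way to do the character-by-character split is to cut the fixed crop into 8 cells of near-equal width. This is a sketch under the assumption that the digits are evenly spaced across the crop, which the question does not confirm; the helper name `split_into_cells` is hypothetical.

```python
import numpy as np


def split_into_cells(crop, n_digits=8):
    """Split a crop into `n_digits` side-by-side cells of near-equal width."""
    # axis=1 splits along the width; array_split tolerates widths that are
    # not an exact multiple of n_digits.
    return np.array_split(crop, n_digits, axis=1)


# Same size as the crop in the question: 580 px wide, 75 px high.
crop = np.zeros((75, 580), dtype=np.uint8)
cells = split_into_cells(crop)
print(len(cells), [cell.shape[1] for cell in cells])
```

Each cell could then be passed to pytesseract.image_to_string with `--psm 10`, the Tesseract page segmentation mode that treats the image as a single character.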

Now I am trying to get better pictures somehow but the options are quite limited.

I would be grateful for any suggestions, as I am doing this for my thesis.

@YvesDaoust I have tried it but the camera can't focus, thus the numbers will be more shady. But thanks for the advice, I will keep trying — BMolics, Oct 31 '22 at 15:54
@Markus Because I am using an IP camera (Imou bullet 2E) and, in my experience, these are the best pictures it is capable of. But if you have any suggestions for how I could improve the image quality, I am happy to try them. — BMolics, Oct 31 '22 at 15:57
Did you check this approach: https://stackoverflow.com/a/32756570/18667225 ? — Markus, Nov 02 '22 at 07:31
