I am trying to develop a Python script that reads numbers from pictures; to be more exact, I am trying to read the gas consumption. The numbers are always in the same location. There are two "types" of pictures, bright and dark. (I take a photo every 10 minutes, so I have plenty of examples if needed.)
As a result I would like to get 8 digits, e.g. 10974748 (from the dark picture).
I am mainly using Pytesseract and OpenCV (cv2).
So far the best solution seems to be to first crop the relevant part of the picture and then call pytesseract.image_to_string() with config = --psm 7. Unfortunately it is not reliable: when several photos were taken while there was no consumption, it does not recognize the same digit combination consistently.
import cv2
import numpy as np
import os
import pytesseract
pytesseract.pytesseract.tesseract_cmd = r"C:\Program Files\Tesseract-OCR\tesseract"
directory = r"C:\Users\user\Desktop\test_pcs\test"
# The digits are always in the same place, so the crop box is fixed
x_start, y_start = 1110, 445
x_end, y_end = 1690, 520
for image in os.listdir(directory):
    OriginalImagePath = os.path.join(directory, image)
    OriginalImage = cv2.imread(OriginalImagePath)
    if OriginalImage is None:  # skip files OpenCV cannot read as images
        continue
    cropped_image = OriginalImage[y_start:y_end, x_start:x_end]
    # --psm 7: treat the crop as a single line of text
    text = pytesseract.image_to_string(cropped_image, config="--psm 7 outputbase digits")
    cv2.imshow("Cropped", cropped_image)
    cv2.waitKey(0)
    print(text + " " + OriginalImagePath)
cv2.destroyAllWindows()
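Since the expected result is always exactly 8 digits, and a counter reading should not decrease between two photos taken 10 minutes apart, the raw OCR string could be sanity-checked before it is accepted. The following is only a minimal sketch: the helper names `clean_reading` and `is_plausible` are hypothetical, and `max_step` is an arbitrary assumed limit that would have to be tuned to the real meter.

```python
import re
from typing import Optional

EXPECTED_DIGITS = 8  # from the question: the reading always has 8 digits


def clean_reading(ocr_text: str) -> Optional[str]:
    """Strip everything except digits; return None unless exactly 8 remain."""
    digits = re.sub(r"\D", "", ocr_text)
    return digits if len(digits) == EXPECTED_DIGITS else None


def is_plausible(previous: Optional[str], current: str, max_step: int = 50) -> bool:
    """Accept a reading only if it did not decrease or jump by more than max_step.

    max_step is a hypothetical limit in counter units per photo interval.
    """
    if previous is None:
        return True  # nothing to compare against yet
    delta = int(current) - int(previous)
    return 0 <= delta <= max_step
```

In the loop above, `clean_reading(text)` could be applied to each result, and frames that come back as `None` or fail `is_plausible` could simply be skipped or flagged for a retry with different preprocessing.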
After that I tried thresholding, but sadly the results are worse than with the plain image_to_string call. Adaptive thresholding gives an output image that does not look bad, but Tesseract can't read it.
import cv2 as cv
import numpy as np
from matplotlib import pyplot as plt
import pytesseract
pytesseract.pytesseract.tesseract_cmd = r"C:\Program Files\Tesseract-OCR\tesseract"
img = cv.imread(r"C:\Users\user\Desktop\test_pcs\new2\2022-10-30_14-49-30.jpg",0)
img = cv.medianBlur(img,5)
ret, th1 = cv.threshold(img, 127, 255, cv.THRESH_BINARY)
# Adaptive mean thresholding
th2 = cv.adaptiveThreshold(img, 255, cv.ADAPTIVE_THRESH_MEAN_C,
                           cv.THRESH_BINARY, 11, 2)
# Adaptive Gaussian thresholding
th3 = cv.adaptiveThreshold(img, 255, cv.ADAPTIVE_THRESH_GAUSSIAN_C,
                           cv.THRESH_BINARY, 11, 2)
images = [img, th2, th3]
for i in range(3):
    plt.subplot(2, 2, i + 1)
    plt.imshow(images[i], 'gray')
plt.show()
x_start, y_start = int(1110), int(450)
x_end, y_end = int(1690), int(520)
cropped_image = th2[y_start:y_end, x_start:x_end]
plt.imshow(cropped_image,'gray')
text = (pytesseract.image_to_string(cropped_image, config="--psm 7 outputbase digits"))
print("digits: " + text)
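Because the photos come in a bright and a dark variant, one thing that might be worth trying before any thresholding is to stretch the contrast of the grayscale crop so that both variants cover a similar intensity range. This is only a sketch under assumptions: `stretch_contrast` is a hypothetical helper, the 2nd/98th percentile limits are arbitrary, and whether it actually improves the Tesseract result on these images is untested.

```python
import numpy as np


def stretch_contrast(gray: np.ndarray, low_pct: float = 2.0, high_pct: float = 98.0) -> np.ndarray:
    """Linearly rescale an 8-bit grayscale image between two percentiles."""
    lo, hi = np.percentile(gray, (low_pct, high_pct))
    if hi <= lo:
        # Flat image: nothing to stretch
        return gray.copy()
    scaled = (gray.astype(np.float32) - lo) * 255.0 / (hi - lo)
    # Values outside the percentile range are clipped to pure black / white
    return np.clip(scaled, 0, 255).astype(np.uint8)
```

It would be applied to the grayscale crop (for example `stretch_contrast(img[y_start:y_end, x_start:x_end])`) before thresholding or passing the result to `image_to_string`.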
I also tried to read the digits character by character but it failed as well.
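For reference, a character-by-character attempt can be sketched by cutting the fixed crop into 8 equal-width cells. This assumes the digits are evenly spaced and that the crop spans exactly the 8 digits, which may not hold for the real photos; `digit_cell_bounds` is a hypothetical helper name.

```python
from typing import List, Tuple


def digit_cell_bounds(width: int, n_digits: int = 8) -> List[Tuple[int, int]]:
    """Split a crop of the given pixel width into n_digits contiguous column ranges."""
    edges = [i * width // n_digits for i in range(n_digits + 1)]
    return list(zip(edges[:-1], edges[1:]))


# Crop width from the question: x_end - x_start = 1690 - 1110 = 580 pixels
bounds = digit_cell_bounds(1690 - 1110)
# Each cell could then be cut out with cropped_image[:, x0:x1] and passed to
# pytesseract.image_to_string(cell, config="--psm 10"), where --psm 10 tells
# Tesseract to treat the image as a single character.
```

Adding a few pixels of margin around each cell, or locating the digit gaps from the image instead of assuming equal widths, may be necessary if the digits are not perfectly aligned.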
Now I am trying to get better pictures somehow, but the options are quite limited.
I would be grateful for any suggestions, as I am doing this for my thesis.