I am trying to read text for prices in a video game and am experiencing difficulty in pre-processing the image.
The rest of my code is "complete", as in after the text is extracted I am formatting it and outputting into CSV for later use.
This is what I have come up with so far for the following images, and would like input on other thresholds or pre-processing tools that will make the OCR more accurate.
After gamma, denoise on left - binary threshold on right
As you can see, it is very close but not perfect. I would like to make it more accurate as I will be processing many frames eventually.
Here is my current code:
import cv2
import pytesseract
import pandas as pd
import numpy as np
# Tells pytesseract where the tesseract environment is installed on local computer
pytesseract.pytesseract.tesseract_cmd = r"C:\Program Files\Tesseract-OCR\tesseract.exe"
img = cv2.imread("./image_frames/frame0.png")
# gamma to darken text to be same opacity?
def adjust_gamma(crop_img, gamma=1.0):
# build a lookup table mapping the pixel values [0, 255] to
# their adjusted gamma values
invGamma = 1.0 / gamma
table = np.array([((i / 255.0) ** invGamma) * 255
for i in np.arange(0, 256)]).astype("uint8")
# apply gamma correction using the lookup table
return cv2.LUT(crop_img, table)
adjusted = adjust_gamma(crop_img, gamma=0.15)
# grayscale the image
gray = cv2.cvtColor(adjusted, cv2.COLOR_BGR2GRAY)
# denoising image
dst = cv2.fastNlMeansDenoising(gray, None, 10, 10, 10)
# binary threshold
thresh = cv2.threshold(gray, 35, 255, cv2.THRESH_BINARY_INV)[1]
# OCR configurations (3 is default)
config = "--psm 3"
# Just show the image
cv2.imshow("before", gray)
cv2.imshow("before", dst)
cv2.imshow("thresh", thresh)
cv2.waitKey(0)
# Reads text from the image and prints to console
text = pytesseract.image_to_string(thresh, config=config)
# remove double lines
text = text.replace('\n\n','\n')
# remove unicode character
text = text.replace('', '')
print(text)
Any help is appreciated as I am very new to this!