
Thanks for reading. To sum up my project: I wrote code that takes a screenshot, processes the image, extracts the numbers from it, and uses them as data to perform different actions.

My problem comes from converting the numbers in the image into data. Most of the time it works, but for some screenshots pytesseract doesn't recognize the numbers, and I think this is due to the image-processing part. Hence, I'm trying to find a way to improve this part of my code.

  1. Code for the image processing (using OpenCV)

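import cv2

def screen_image_process():
    # Load the screenshot and convert it to grayscale.
    image = cv2.imread('sct-568x639_180x16.png')
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    # Inverted Otsu threshold: bright pixels become black, dark pixels become white.
    thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]
    cv2.imshow('thresh', thresh)

    # Morphological opening with a 2x2 kernel removes small white specks.
    kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (2, 2))
    opening = cv2.morphologyEx(thresh, cv2.MORPH_OPEN, kernel, iterations=1)
    cv2.imshow('opening', opening)

    # Fill any remaining small white blobs (area < 50) with black.
    cnts = cv2.findContours(opening, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    cnts = cnts[0] if len(cnts) == 2 else cnts[1]  # 2-tuple in OpenCV 4, 3-tuple in OpenCV 3
    for c in cnts:
        area = cv2.contourArea(c)
        if area < 50:
            cv2.drawContours(opening, [c], -1, 0, -1)

    # Invert and lightly blur; `result` is the image passed to Tesseract.
    result = 255 - opening
    result = cv2.GaussianBlur(result, (3, 3), 0)
    cv2.imshow('result', result)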

  2. Code to convert the numbers to data using pytesseract (it is part of the same function as above; I split it into two parts to make it clearer)

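import pytesseract

def screen_image_process():
    # ... continues from the image processing above, where `result` is computed ...
    data = pytesseract.image_to_string(result, lang='eng', config='--psm 6')
    print(data)
    # Take the field after the first '/', then strip spaces and stray '&' characters.
    data = data.split('/')
    data[1] = data[1].replace(" ", "")
    data[1] = data[1].replace("&", "")
    agrsrn = int(data[1])
    return agrsrn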

Below are two screenshots and the results I got from pytesseract's `image_to_string`. Because the output contains no usable number, the final `int` call fails.
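Independently of the image processing, the parsing step can be made to fail gracefully when the OCR output is garbage. The following is only a minimal sketch: `parse_after_slash` is a hypothetical helper name, and it assumes (as the code above does) that the wanted number is the field after the first `/`. Unlike the original, it drops every non-digit character, not just spaces and `&`.

```python
import re
from typing import Optional


def parse_after_slash(text: str) -> Optional[int]:
    """Return the integer found after the first '/', or None if there is none.

    Hypothetical helper: assumes the OCR text looks like '<something>/<number>'.
    """
    parts = text.split("/")
    if len(parts) < 2:
        # No '/' at all, e.g. the OCR returned unrelated letters.
        return None
    # Keep ASCII digits only (more aggressive than stripping just ' ' and '&').
    digits = re.sub(r"[^0-9]", "", parts[1])
    return int(digits) if digits else None


if __name__ == "__main__":
    print(parse_after_slash("12 / 3 4&5\n"))    # 345
    print(parse_after_slash("Kakpaienelee)"))   # None
```

Returning `None` lets the caller decide whether to retry with different preprocessing or skip the screenshot, rather than crashing on an `IndexError` or `ValueError`.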

Screenshot not working no. 1:

[image: screenshot not working no. 1]

Result from pytesseract `image_to_string`:

Kakpaienelee)

Screenshot not working no. 2:

[image: screenshot not working no. 2]

Result from pytesseract `image_to_string`:

POTOROOOMAcYACUANLORG.0 0)

Do you have any idea what improvement I could make to the image processing to fix this error?

Thanks a lot!

Jeru Luke
Alex.A
  • The numbers on those sample images are white. Try a fixed threshold close to 255, maybe cut at 245, invert the image and give it to tesseract. – stateMachine May 10 '22 at 19:24
  • The text in the variable `result` is white. The text in the variable `opening` is black. Try passing `opening` to Tesseract, as it works best with dark text on a light background. – Jeru Luke May 10 '22 at 19:32
  • Additionally, you can try giving Tesseract hints about which characters are expected, as discussed [here](https://stackoverflow.com/q/46574142/18667225). – Markus May 11 '22 at 06:35
  • Thanks everybody for your answers. Actually, making the text black almost totally solved my problem! – Alex.A May 12 '22 at 11:28
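The suggestions in these comments could be combined roughly as follows. This is a sketch under assumptions, not a tested fix: the function names are hypothetical, the cutoff of 245 is the value proposed in the first comment, and the whitelist `0123456789/` is a guess based on the question's code splitting on `/`. `tessedit_char_whitelist` is a Tesseract configuration variable, but whether it is honoured depends on the Tesseract version and engine in use.

```python
def build_digit_config(psm: int = 6, whitelist: str = "0123456789/") -> str:
    """Build a Tesseract config string that hints at the expected characters."""
    return f"--psm {psm} -c tessedit_char_whitelist={whitelist}"


def ocr_white_digits(path: str, cutoff: int = 245) -> str:
    """OCR near-white text by turning it into black text on a white background."""
    # Imported here so build_digit_config() stays usable without these packages.
    import cv2
    import pytesseract

    gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    if gray is None:
        raise FileNotFoundError(path)
    # Fixed threshold with inversion in one call:
    # pixels brighter than `cutoff` become 0 (black), all others become 255 (white).
    _, dark_on_light = cv2.threshold(gray, cutoff, 255, cv2.THRESH_BINARY_INV)
    return pytesseract.image_to_string(
        dark_on_light, lang="eng", config=build_digit_config()
    )


if __name__ == "__main__":
    print(build_digit_config())
```

A fixed cutoff only suits screenshots where the digits are close to pure white; for other colour schemes the Otsu-based thresholding from the question, with `opening` passed to Tesseract instead of `result`, may be the better starting point.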
