My goal is to detect the characters on images of this kind.
I need to preprocess the image so that Tesseract recognizes the characters more reliably, probably with the following steps:
- Rotate the image so that the blue rectangle is horizontal [Need help on this]
- Crop the image according to the blue rectangle [Need help on this]
- Apply a thresholding filter and a Gaussian blur
- Use Tesseract to detect the characters
    import cv2
    import numpy as np
    import pytesseract
    from PIL import Image

    img = Image.open('grid.jpg')
    # Convert the PIL image (RGB) to an OpenCV-style BGR array
    image = np.array(img.convert("RGB"))[:, :, ::-1].copy()

    # Need to rotate the image here and fill the blanks
    # Need to crop the image here

    # Gray the image
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

    # Otsu's thresholding
    ret3, th3 = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)

    # Gaussian Blur
    blur = cv2.GaussianBlur(th3, (5, 5), 0)

    # Save the image
    cv2.imwrite("preproccessed.jpg", blur)

    # Apply the OCR
    pytesseract.pytesseract.tesseract_cmd = r'C:/Program Files (x86)/Tesseract-OCR/tesseract.exe'
    tessdata_dir_config = r'--tessdata-dir "C:/Program Files (x86)/Tesseract-OCR/tessdata" --psm 6'
    preprocessed = Image.open('preproccessed.jpg')
    boxes = pytesseract.image_to_data(preprocessed, config=tessdata_dir_config)
Here is the output image I get, which is still not good enough for OCR:
OCR problems:
- The blue rectangle is sometimes recognized as characters, which is why I would like to crop the image.
- Sometimes Tesseract recognizes the characters on a line as a single word (GCVDRTEUQCEBURSIDEEC), and at other times as individual letters. I would like each line to always come back as one word.
- The little pyramid at the bottom right is recognized as a character.
Any other suggestions to improve the recognition are welcome.
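For the word-versus-letters problem, one workaround I can think of is to stop relying on how Tesseract segments words and instead merge everything it reports on the same line. The sketch below assumes the dictionary form of the results, as returned by `pytesseract.image_to_data(..., output_type=pytesseract.Output.DICT)`, with its `text`, `conf`, `block_num`, `par_num`, `line_num` and `left` keys. `join_words_by_line` is a hypothetical helper, and the `min_conf` cut-off is an assumed parameter that would need tuning.

```python
from collections import defaultdict


def join_words_by_line(data, min_conf=0):
    """Concatenate all tokens Tesseract found on each text line.

    `data` is the dict returned by pytesseract.image_to_data with
    output_type=pytesseract.Output.DICT. Tokens are ordered left to right
    and joined without spaces, so each line becomes a single string.
    """
    lines = defaultdict(list)
    for i, text in enumerate(data["text"]):
        text = text.strip()
        if not text:
            continue  # block, paragraph and line rows carry no text
        if float(data["conf"][i]) < min_conf:
            continue  # drop low-confidence tokens
        key = (data["block_num"][i], data["par_num"][i], data["line_num"][i])
        lines[key].append((data["left"][i], text))
    # Sort lines top to bottom by their keys, tokens by x position
    return ["".join(token for _, token in sorted(tokens))
            for _, tokens in sorted(lines.items())]
```

Raising `min_conf` might also filter out the pyramid if Tesseract reports it with low confidence, but I have not verified that. Another option could be adding `-c tessedit_char_whitelist=ABCDEFGHIJKLMNOPQRSTUVWXYZ` to the config string so that only capital letters are returned, although whether the whitelist is honored seems to depend on the Tesseract version and engine mode.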