I use Tesseract OCR to extract text from scanned images. For a few images, the text is not recognized properly because of low resolution, and the output consists of irrelevant characters.
Techniques applied:
Increasing the DPI to 300.
Image pre-processing techniques in OpenCV.
Upscaling of images using dnn_superres in OpenCV.
Noise removal techniques.
Referred to Git repos where a super-resolution model is developed using deep learning.
Improving Tesseract OCR quality by training tessdata.
Reference Links:
Sample Image:
Is there a simple way in Python to improve the text recognition without using a deep learning model?