I'm working on an application that extracts information from invoices the user photographs with their phone (using Flask and pytesseract).
Everything works on the extraction and classification side for my needs, using pytesseract's image_to_data method.
But the problem is on the pre-processing side. I refine the image with greyscale conversion, binarization, dilation, etc. But sometimes the user takes a picture at an angle, like this: invoice
Tesseract then returns characters that make no sense, or sometimes nothing at all.
At the moment I "scan" the image during pre-processing (largely inspired by this tutorial: https://www.pyimagesearch.com/2014/09/01/build-kick-ass-mobile-document-scanner-just-5-minutes/), but it isn't effective at all.
Does anyone know a way to make it easier for Tesseract to work on these types of images? If not, should I focus on improving this pre-processing "scan" step?
Thank you for your help!