
I'm working on an application that extracts information from invoices that the user photographs with their phone (using Flask and pytesseract).

Everything works on the extraction and classification side for my needs, using pytesseract's image_to_data method.
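For context, the extraction side boils down to filtering the `image_to_data` output by confidence. A minimal sketch: the sample dict below only mirrors the *shape* of what `image_to_data(..., output_type=Output.DICT)` returns (real output has many more keys and the values here are made up):

```python
def confident_words(data, min_conf=60):
    """Keep words whose Tesseract confidence exceeds min_conf.

    `data` has the shape of the dict returned by
    pytesseract.image_to_data(img, output_type=pytesseract.Output.DICT).
    Non-word boxes carry a confidence of -1, so they are dropped too.
    """
    return [
        text for text, conf in zip(data["text"], data["conf"])
        if text.strip() and float(conf) > min_conf
    ]

# Illustrative sample mirroring image_to_data's dict structure
# (real output also has "left", "top", "width", "height", ...).
sample = {
    "text": ["Invoice", "", "Total:", "%$#@"],
    "conf": [96, -1, 91, 12],
}
print(confident_words(sample))  # → ['Invoice', 'Total:']
```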

But the problem is on the pre-processing side. I refine the image with greyscale conversion, binarization, dilation, etc. But sometimes the user takes the picture at an angle, like this: [image: invoice]
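The cleanup steps I mentioned (greyscale, binarization, dilation) amount to something like the following. This is a pure-NumPy sketch of what the usual OpenCV calls (`cvtColor`, `threshold`, `dilate`) do, with an assumed fixed threshold rather than an adaptive one:

```python
import numpy as np

def preprocess(rgb, thresh=128, dilate_radius=1):
    """Greyscale -> binarize -> dilate, the usual cleanup before OCR.

    Pure-NumPy stand-in for the cv2 pipeline; `thresh` is a fixed
    cutoff here, where a real pipeline might use Otsu's method.
    """
    # Luminance greyscale (ITU-R BT.601 weights).
    gray = rgb @ np.array([0.299, 0.587, 0.114])
    # Fixed-threshold binarization: white where bright, black elsewhere.
    binary = (gray > thresh).astype(np.uint8) * 255
    # Dilation: a pixel becomes white if any neighbour within the
    # square radius is white (grows strokes, closes small gaps).
    padded = np.pad(binary, dilate_radius, mode="constant")
    h, w = binary.shape
    dilated = np.zeros_like(binary)
    for dy in range(2 * dilate_radius + 1):
        for dx in range(2 * dilate_radius + 1):
            dilated = np.maximum(dilated, padded[dy:dy + h, dx:dx + w])
    return dilated
```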

And then Tesseract will return characters that don't make sense, or sometimes nothing at all.

At the moment I "scan" the image during pre-processing (largely inspired by this tutorial: https://www.pyimagesearch.com/2014/09/01/build-kick-ass-mobile-document-scanner-just-5-minutes/), but it doesn't work well at all.
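The "scan" step in that tutorial comes down to a four-point perspective transform: detect the invoice's four corners and map them onto an upright rectangle. Here is a NumPy sketch of the underlying math; the corner coordinates below are hypothetical, and in practice `cv2.getPerspectiveTransform` plus `cv2.warpPerspective` do this work:

```python
import numpy as np

def homography(src, dst):
    """Solve for the 3x3 perspective matrix H mapping src -> dst
    (the same math cv2.getPerspectiveTransform performs)."""
    A, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        A.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        A.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.extend([u, v])
    h = np.linalg.solve(np.array(A, float), np.array(b, float))
    return np.append(h, 1.0).reshape(3, 3)

def warp_point(H, x, y):
    """Apply H to one point (what cv2.warpPerspective does per pixel)."""
    u, v, w = H @ np.array([x, y, 1.0])
    return u / w, v / w

# Hypothetical corners of the skewed invoice in the photo,
# mapped onto an upright 400x600 rectangle.
src = [(50, 80), (420, 60), (460, 620), (30, 590)]
dst = [(0, 0), (400, 0), (400, 600), (0, 600)]
H = homography(src, dst)
```

Once the image is warped upright like this, Tesseract usually copes far better, since it assumes roughly horizontal text lines.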

Does anyone know a way to make it easier for Tesseract to work on this type of image? If not, should I focus on improving this pre-processing "scan" step?

Thank you for your help!

Axel
  • I'm finding the writing on that receipt hard to read myself. Not sure if that's a thumbnail of the image you intend to process, or the actual user-submitted image. If the former, I'd upload a full-res image so people here can assist you. If the latter, I'd probably request that the user re-upload with better resolution! Regarding the angle, perhaps [this thread](https://stackoverflow.com/questions/57964634/python-opencv-skew-correction-for-ocr) covers what you are trying to do? – v25 Dec 09 '21 at 15:25

0 Answers