27

I'm a beginner in computer vision, but I know how to use some OpenCV functions. I'm trying to use OpenCV for document recognition, and I'd like help working out the steps involved.

I'm thinking of starting from the OpenCV example find_obj.cpp, but documents such as passports contain fields that vary from one document to the next: name, birthdate, picture. So I need help defining the steps and, if possible, knowing which functions to use at each step.

I'm not asking for complete code, but an example link or a short walkthrough would be a great help.

Ricardo Cunha

1 Answer

38

There are two very different steps involved here. One is detecting your object, and the other is analyzing it.

For object detection, you're just trying to figure out whether the object is in the frame and approximately where it's located. The OpenCV features2d framework is great for this. For tutorials and comprehensive sample code, see the OpenCV features2d tutorials, especially the feature matching tutorial.

For analysis, you need to dig into optical character recognition (OCR). OpenCV does not include OCR libraries, but I recommend checking out tesseract-ocr, which is a great OCR library. If your documents have a fixed structure (a consistent layout of text fields), then tesseract-ocr is all you need. For more advanced analysis, check out ocropus, which uses tesseract-ocr but adds layout analysis.
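To make the fixed-structure case concrete, here is a small sketch of one way to describe the layout, using only the Python standard library. The field names and fractional coordinates in `PASSPORT_LAYOUT` are made up for illustration, and `field_boxes` is a hypothetical helper. The idea is to store each field as a fraction of the document size, so the same layout works at any resolution once the document has been located and warped to a front-on view.

```python
from typing import Dict, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom) in pixels
RelBox = Tuple[float, float, float, float]  # (x, y, width, height) as fractions

# Hypothetical layout: these numbers are placeholders, not a real passport spec.
PASSPORT_LAYOUT: Dict[str, RelBox] = {
    "photo": (0.05, 0.25, 0.25, 0.50),
    "name": (0.35, 0.30, 0.60, 0.08),
    "birthdate": (0.35, 0.50, 0.30, 0.08),
}


def field_boxes(width: int, height: int, layout: Dict[str, RelBox]) -> Dict[str, Box]:
    """Convert fractional field rectangles into pixel boxes for a given image size."""
    boxes = {}
    for name, (x, y, w, h) in layout.items():
        left = round(x * width)
        top = round(y * height)
        # Clamp to the image so a slightly oversized field cannot run off the edge.
        right = min(width, round((x + w) * width))
        bottom = min(height, round((y + h) * height))
        boxes[name] = (left, top, right, bottom)
    return boxes


if __name__ == "__main__":
    for name, box in field_boxes(1000, 700, PASSPORT_LAYOUT).items():
        print(name, box)
```

Each text box would then be cropped from the rectified image and passed to the OCR engine on its own, while the photo box can simply be saved as an image.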

Kyle McDonald
  • I tried this solution, but I haven't had success on a real-world example. I think that with template matching I can only use images with the same resolution. Or not? – Ricardo Cunha Sep 30 '11 at 00:32
  • 2
    If you're having trouble with a real-world example, you might need to train tesseract-ocr for the specific font that you're using. Otherwise it's going to use its default database, and that might not match the text you're working with. You might also try scaling your text before you feed it to tesseract-ocr; I found that a height of around 20 px works well. – Kyle McDonald Sep 30 '11 at 08:43
  • Do you have a link on how to train tesseract? I'm having trouble getting good results and cannot find a good, understandable tutorial on how to do the training – Tjorriemorrie Apr 24 '17 at 00:43
  • 1
    @Tjorriemorrie Choose your version for [training instructions](https://github.com/tesseract-ocr/tesseract/wiki/TrainingTesseract) – eshirima Jun 01 '17 at 13:59