I'm trying to create a simple OCR engine using OpenCV. I have this image: https://dl.dropbox.com/u/63179/opencv/test-image.png

I have saved all possible characters as images and am trying to detect these images in the input image.

From here I need to identify the code. I have been trying matchTemplate and FAST detection. Both seem to fail (or more likely: I'm doing something wrong).

When I used the matchTemplate method, I found the edges of both the input image and the reference images using Sobel. This provides a working result, but the accuracy is not good enough.

When using the FAST method, it seems like I can't get any interesting descriptors from the cvExtractSURF method.

Any recommendations on the best way to read this kind of code?

UPDATE 1 (2012-03-20)

I have made some progress. I'm trying to find the bounding rects of the characters, but the matrix font is killing me. See the samples below:

My font: https://dl.dropbox.com/u/63179/opencv/IMG_0873.PNG

My font filled in: https://dl.dropbox.com/u/63179/opencv/IMG_0875.PNG

Other font: https://dl.dropbox.com/u/63179/opencv/IMG_0874.PNG

As seen in the samples, I can find the bounding rects for a less complex font, and if I fill in the space between the dots in my font it also works. Is there a way to achieve this with OpenCV? If I could find the bounding box of each character, it would be much simpler to recognize the character.

Any ideas?

Update 2 (2013-03-21)

Ok, I had some luck with finding the bounding boxes. See image: https://dl.dropbox.com/u/63179/opencv/IMG_0891.PNG

I'm not sure where to go from here. I tried to use matchTemplate, but I guess that is not a good option in this case? I guess it is better suited to searching for an exact match in a bigger picture?

I tried to use SURF, but when I try to extract the descriptors with cvExtractSURF for each bounding box I get 0 descriptors... Any ideas?

What method would be most appropriate to use to be able to match the bounding box against a reference image?

Magnus O.

1 Answer

You're going the hard way with FAST+SURF, because they were not designed for this task. In particular, FAST detects corner-like features that are ubiquitous in structure-from-motion but far less present in OCR.

Two suggestions:

  1. Build a feature vector from the number and locations of the FAST keypoints. You can quickly check whether these features are discriminative enough and, if so, train a classifier on them.
  2. (The one I would choose myself.) Partition your image samples into smaller squares. Compute only the SURF descriptor for each square and concatenate them all to form the feature vector for a given sample. Then train a classifier with these feature vectors.

Note that option 2 works with any descriptor that you can find in OpenCV (SIFT, SURF, FREAK...).
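As a rough illustration of option 2, here is a minimal pure-Python sketch of the partition-and-concatenate step. To keep it self-contained it does not compute SURF: the `ink_density` function is a stand-in descriptor (fraction of foreground pixels per square), and all function names and the 2x2 grid are assumptions made for this example. A real descriptor would be passed in through the `describe` parameter instead.

```python
def split_into_cells(image, rows, cols):
    """Yield rows x cols sub-images of a 2-D list, in row-major order."""
    height, width = len(image), len(image[0])
    for r in range(rows):
        y0, y1 = r * height // rows, (r + 1) * height // rows
        for c in range(cols):
            x0, x1 = c * width // cols, (c + 1) * width // cols
            yield [row[x0:x1] for row in image[y0:y1]]


def ink_density(cell):
    """Stand-in descriptor (not SURF): fraction of non-zero pixels in the cell."""
    pixels = [p for row in cell for p in row]
    if not pixels:
        return [0.0]
    return [sum(1 for p in pixels if p) / len(pixels)]


def grid_feature_vector(image, rows=2, cols=2, describe=ink_density):
    """Concatenate the per-cell descriptors into one feature vector."""
    features = []
    for cell in split_into_cells(image, rows, cols):
        features.extend(describe(cell))
    return features


# Hypothetical 4x4 binarized character sample (1 = ink, 0 = background).
sample = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 1, 1],
]
vector = grid_feature_vector(sample)
print(vector)
```

Because every sample is split into the same grid, the vectors always have the same length, which is what a classifier needs.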

Answer to update 1

Here is a little trick that senior people taught me when I started. On your image with the dots, you can project your binarized data onto the horizontal and vertical axes. By searching for holes (gaps) in the projected profiles, you are likely to recover almost all the bounding boxes in your example.
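The projection trick can be sketched in plain Python as follows. This is only an illustration under assumptions: the image is already binarized into a list of rows of 0/1 values, and the `min_gap` parameter (how many empty bins count as a hole) is something added for this example so that the small gaps between dots of one character can be ignored. The function names are hypothetical.

```python
def projection_runs(profile, min_gap=1):
    """Return (start, end) pairs (end exclusive) of non-empty runs in a profile.

    A run is closed only after at least `min_gap` consecutive empty bins.
    """
    runs = []
    start = None  # start index of the current run, None when outside a run
    gap = 0       # consecutive empty bins seen since the last non-empty bin
    for i, value in enumerate(profile):
        if value > 0:
            if start is None:
                start = i
            gap = 0
        elif start is not None:
            gap += 1
            if gap >= min_gap:
                runs.append((start, i - gap + 1))
                start = None
                gap = 0
    if start is not None:
        runs.append((start, len(profile) - gap))
    return runs


def character_boxes(binary, min_gap=1):
    """Split a 0/1 image into (left, top, right, bottom) boxes, ends exclusive.

    Rows are split into text bands first, then each band is split by columns.
    """
    boxes = []
    row_profile = [sum(row) for row in binary]
    for top, bottom in projection_runs(row_profile, min_gap):
        band = binary[top:bottom]
        col_profile = [sum(col) for col in zip(*band)]
        for left, right in projection_runs(col_profile, min_gap):
            boxes.append((left, top, right, bottom))
    return boxes


# Two hypothetical dot-matrix "characters", each made of four separate dots.
image = [
    [1, 0, 1, 0, 0, 0, 1, 0, 1],
    [0, 0, 0, 0, 0, 0, 0, 0, 0],
    [1, 0, 1, 0, 0, 0, 1, 0, 1],
]
boxes = character_boxes(image, min_gap=2)
print(boxes)
```

With `min_gap=1` every dot gets its own box; raising it to 2 merges the dots that belong to the same character while still splitting at the wider gap between characters. The top and bottom of each box are those of the whole text band, not a tight fit per character.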

Answer to update 2

At this point, you're back to my initial answer: SURF will be of no use here. Instead, a standard approach is to binarize each bounding box (to 0 or 1 depending on background/letter), normalize the bounding boxes to a standard size, and train a classifier from there.

There are several tutorials and blog posts on the web about digit recognition using neural networks or SVMs; you just have to replace the digits with your letters.
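A minimal sketch of the binarize-and-normalize step, in plain Python. The threshold of 128, the assumption that ink is darker than the background, the nearest-neighbour resize, and the 8x8 target size are all choices made for this example, not values from the question.

```python
def binarize(gray, threshold=128, dark_ink=True):
    """Map a grayscale crop (rows of 0-255 ints) to 0/1, where 1 means ink."""
    if dark_ink:
        return [[1 if p < threshold else 0 for p in row] for row in gray]
    return [[1 if p >= threshold else 0 for p in row] for row in gray]


def normalize(binary, size=8):
    """Resize a 0/1 crop to size x size using nearest-neighbour sampling."""
    height, width = len(binary), len(binary[0])
    return [
        [binary[y * height // size][x * width // size] for x in range(size)]
        for y in range(size)
    ]


def feature_vector(gray, size=8, threshold=128):
    """Binarize, normalize, and flatten one bounding-box crop."""
    grid = normalize(binarize(gray, threshold), size)
    return [pixel for row in grid for pixel in row]


# Hypothetical 4x2 grayscale crop of one bounding box (0 = black ink).
crop = [
    [0, 255],
    [0, 255],
    [255, 0],
    [255, 0],
]
print(feature_vector(crop, size=2))
```

Whatever the original box dimensions, every crop ends up as a vector of `size * size` values, so boxes of different widths can be compared directly.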
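To show where the normalized vectors end up, here is a toy sketch of the train-and-predict step. It deliberately swaps in a nearest-neighbour lookup with Hamming distance in place of a neural network or SVM, purely to keep the example short; the labels and reference vectors are made up for illustration.

```python
def hamming(a, b):
    """Count the positions where two equal-length 0/1 vectors differ."""
    return sum(x != y for x, y in zip(a, b))


def classify(sample, training_set):
    """Return the label of the training vector closest to `sample`.

    `training_set` is a list of (label, vector) pairs.
    """
    best_label, _ = min(
        ((label, hamming(sample, vector)) for label, vector in training_set),
        key=lambda pair: pair[1],
    )
    return best_label


# Hypothetical reference vectors for two characters (2x2 normalized boxes).
training_set = [
    ("A", [1, 0, 0, 1]),
    ("B", [0, 1, 1, 0]),
]
print(classify([1, 0, 0, 0], training_set))
```

A trained SVM or neural network would replace `classify`, but it would consume the same kind of fixed-length vectors.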

Your work is almost done! Training and using a classifier is tedious but straightforward.

sansuiso
  • Ok, I will find smaller and more rects so that I will get one rect around each dot. Then I will measure the distance between the dots and if the distance is small enough I will combine the rects. This way I should be able to find the surrounding rect of almost the entire character. When I have a rect that is close enough I will use this to train the app and I will probably be able to match the letters. Sounds like a good approach? – Magnus O. Mar 20 '13 at 15:12
  • If you blur your image a bit, the small trick that I described in the update should work directly, without having to consider the distances between dots. Also, putting a prior on the bounding box size can be more robust than what you propose (google for "single linkage" to see the advantages and risks of considering dot distances). – sansuiso Mar 21 '13 at 09:09