
I've been searching the web for resources on number recognition in images. I found many links with plenty of material on the topic, but it has been more confusing than helpful, and I don't know where to start.

I've got an image with 5 numbers in it, undistorted (no CAPTCHA or anything like that). The numbers are black on a white background and written in a standard font.

My first step was to separate the numbers. The algorithm I currently use is quite simple: it checks whether a column is entirely white and, if so, treats it as a space. Then it trims each character so that there is no white border around it. This works quite well.
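For readers following along, the separation step described above might look roughly like this in Java. This is only a sketch under assumptions: the image is assumed to be already binarized into a `boolean[][]` (rows first, `true` meaning a black pixel), and the class and method names (`DigitSegmenter`, `segment`) are made up for illustration, not taken from any library.

```java
import java.util.ArrayList;
import java.util.List;

public class DigitSegmenter {

    /** Returns true if column x contains no black pixel. */
    static boolean isBlankColumn(boolean[][] ink, int x) {
        for (boolean[] row : ink) {
            if (row[x]) return false;
        }
        return true;
    }

    /** Returns true if row y contains no black pixel between columns x0 (inclusive) and x1 (exclusive). */
    static boolean isBlankRow(boolean[][] ink, int y, int x0, int x1) {
        for (int x = x0; x < x1; x++) {
            if (ink[y][x]) return false;
        }
        return true;
    }

    /**
     * Splits the image at entirely white columns and trims each piece.
     * Each result is {left, top, right, bottom}, with right and bottom exclusive.
     */
    static List<int[]> segment(boolean[][] ink) {
        List<int[]> boxes = new ArrayList<>();
        int height = ink.length;
        int width = height == 0 ? 0 : ink[0].length;
        int start = -1; // left edge of the character currently being scanned, or -1

        // Scan one column past the end so a character touching the right edge is closed too.
        for (int x = 0; x <= width; x++) {
            boolean blank = (x == width) || isBlankColumn(ink, x);
            if (!blank && start < 0) {
                start = x; // first inked column of a new character
            } else if (blank && start >= 0) {
                // Trim white rows above and below the character.
                int top = 0;
                while (isBlankRow(ink, top, start, x)) top++;
                int bottom = height;
                while (isBlankRow(ink, bottom - 1, start, x)) bottom--;
                boxes.add(new int[] {start, top, x, bottom});
                start = -1;
            }
        }
        return boxes;
    }
}
```

Note that this approach only works while the characters do not touch each other; two glyphs sharing an inked column would come out as one box.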

But now I'm stuck on the actual recognition of each number. I don't know the best way to guess the correct one. I don't think comparing directly against the font is a good idea, because if the numbers differ even slightly, it will no longer work.

Could anyone give me a hint on how this is done?

It doesn't matter for the question, but I'll be implementing this in C# or Java. I found some libraries that would do the job, but I'd like to implement it myself in order to learn something.

tshepang
svens

1 Answer


Why not look at using an open source OCR engine such as Tesseract?

http://code.google.com/p/tesseract-ocr/

C# Wrapper for Tesseract

http://www.pixel-technology.com/freeware/tessnet2/

Java Wrapper for Tesseract

http://sourceforge.net/projects/tessocrinjava/

While you might not consider using a third-party library to be implementing it yourself, a tremendous amount of work goes into just integrating a third-party tool. Keep in mind also that something that seems simple (recognizing the number 5 versus the number 6) is often very complex; we're talking thousands and thousands of lines of code complex. At the very least, look at the source code for Tesseract, and it'll give you a good reason to want to leverage a third-party library.

Here's another SO question that'll give you some ideas about the algorithms involved: https://stackoverflow.com/questions/850717/what-are-some-popular-ocr-algorithms

Juan
Keith Adler
  • Thanks for the tip. Actually, I'm not that good at C/C++ and there's a lot of code. I'm still hoping I won't have to understand a whole OCR software project just to learn number recognition. – svens Mar 09 '10 at 19:31
  • This will remove the need for you to use C++ ... the C# wrapper is pretty straightforward. Unless you want to become an expert in machine learning and image optimization, you really don't want to try to roll your own OCR solution. – Keith Adler Mar 09 '10 at 19:32
  • +1 Tesseract is awesome. You can use any language you want as long as you call it on the command line. – rook Mar 09 '10 at 19:37
  • You can also use it as a DLL without much effort, so no command line is necessary. It comes with this out of the box, as they say in their release notes. http://code.google.com/p/tesseract-ocr/wiki/ReleaseNotes – Keith Adler Mar 09 '10 at 19:51
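
To illustrate the command-line route mentioned in the comments, here is a rough Java sketch. It assumes a `tesseract` executable is installed and on the PATH, and it relies only on the basic `tesseract <image> <outputbase>` form, which writes the recognized text to `<outputbase>.txt`. The class and method names (`TesseractCli`, `buildCommand`, `recognize`) are hypothetical, and which image formats are accepted depends on the installed Tesseract version.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;

public class TesseractCli {

    /** Builds the basic invocation: tesseract <image> <outputbase>. */
    static List<String> buildCommand(String imagePath, String outputBase) {
        return Arrays.asList("tesseract", imagePath, outputBase);
    }

    /**
     * Runs Tesseract as an external process and returns the recognized text.
     * Assumes the "tesseract" executable can be found on the PATH.
     */
    static String recognize(String imagePath, String outputBase)
            throws IOException, InterruptedException {
        Process process = new ProcessBuilder(buildCommand(imagePath, outputBase))
                .inheritIO() // forward Tesseract's own messages to this console
                .start();
        int exitCode = process.waitFor();
        if (exitCode != 0) {
            throw new IOException("tesseract exited with code " + exitCode);
        }
        // Tesseract appends ".txt" to the output base name.
        Path result = Paths.get(outputBase + ".txt");
        return new String(Files.readAllBytes(result), StandardCharsets.UTF_8).trim();
    }
}
```

Calling `TesseractCli.recognize("digits.png", "out")` would then return whatever text Tesseract found in the (hypothetical) file `digits.png`.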