I'm implementing an ANPR system. I have managed to obtain a cleanly cropped number plate region from a still vehicle image, but I have a problem with certain characters when I feed them to the OCR engine.

I am using Tesseract, and the problem is that characters such as "W" and "M" are recognized as "N", "U" is sometimes recognized as "D", and "O" is even recognized as "U" or "D", or the other way around.

I am wondering whether it's something to do with the OCR engine itself or whether I could do something to improve the number plate image. Following is the number plate image:

I am using the AForge.NET framework with C#, and Tesseract as the OCR engine. Any advice would be much appreciated.

Mr.Noob
  • I appreciate you decided against posting the *full* image (privacy concerns?), but the question needs to show at least *the characters that aren't OCRing* in order to make sense. Assuming the others are OCRed OK, could you trim down the image and edit to show just (e.g.) the `W`? – AakashM Aug 10 '12 at 11:18
  • Did you train on the number plate font? Tesseract works better if you do that. Also, you may want to preprocess the image. If the plates are not shot dead on, you may want to undo the perspective skew and stretch the aspect ratio back to normal so the font is consistent with the training. – dvhamme Oct 11 '12 at 11:17
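
As a rough illustration of the "undo the perspective skew" suggestion in the comment above, the sketch below computes the 3x3 homography that maps the four corners of a skewed plate onto an upright rectangle. This is only a minimal standard-library Python sketch, not the approach the poster used: the corner coordinates, the 520x110 target size, and the function names `solve_linear`, `homography`, and `apply_homography` are all hypothetical and chosen for illustration.

```python
def solve_linear(a, b):
    """Solve a * x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        # Pick the row with the largest entry in this column as the pivot.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate corner configuration")
        m[col], m[pivot] = m[pivot], m[col]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= f * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]


def homography(src, dst):
    """Return the 3x3 matrix mapping four src (x, y) points onto four dst points."""
    a, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        # Two linear equations per point pair, with the last entry fixed to 1.
        a.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        a.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    h = solve_linear(a, b)
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]


def apply_homography(h, x, y):
    """Map one point through the homography (includes the perspective divide)."""
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return ((h[0][0] * x + h[0][1] * y + h[0][2]) / w,
            (h[1][0] * x + h[1][1] * y + h[1][2]) / w)


# Hypothetical plate corners in the photo (clockwise from top-left)
# and the upright rectangle they should land on.
skewed = [(12, 8), (498, 30), (505, 140), (5, 118)]
upright = [(0, 0), (520, 0), (520, 110), (0, 110)]
h = homography(skewed, upright)
```

To actually rectify an image you would normally compute the mapping in the opposite direction (upright rectangle to photo) and sample the photo for every output pixel; an imaging library usually provides that warp for you.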

1 Answer

Hi guys, I managed to sort this out by doing better preprocessing before the OCR step, because my characters seemed to be too distorted. Sorry that I have not closed this question :( Thanks for your efforts, guys, I really appreciate it.
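
The post does not say which preprocessing steps made the difference. Purely as an illustration, one common step before OCR is binarizing the grayscale plate with Otsu's method, so the characters come out as clean black-and-white shapes. The sketch below is plain standard-library Python rather than AForge.NET code, and the function names `otsu_threshold` and `binarize` are hypothetical.

```python
def otsu_threshold(pixels):
    """Return the Otsu threshold for an iterable of 8-bit gray values (0..255)."""
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = sum(hist)
    sum_all = sum(i * count for i, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w_bg = 0    # number of pixels at or below the candidate threshold
    sum_bg = 0  # sum of their gray values
    for t in range(256):
        w_bg += hist[t]
        if w_bg == 0:
            continue
        w_fg = total - w_bg
        if w_fg == 0:
            break
        sum_bg += t * hist[t]
        mean_bg = sum_bg / w_bg
        mean_fg = (sum_all - sum_bg) / w_fg
        # Between-class variance (up to a constant factor); Otsu maximizes it.
        var_between = w_bg * w_fg * (mean_bg - mean_fg) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def binarize(image):
    """Turn a grayscale image (list of rows) into 0/255 using Otsu's threshold."""
    t = otsu_threshold(p for row in image for p in row)
    return [[255 if p > t else 0 for p in row] for row in image]


# Tiny made-up "image": dark values on the left, bright values on the right.
plate = [[20, 30, 200],
         [25, 210, 220]]
print(binarize(plate))
```

Whether binarization alone is enough depends on the images; it is often combined with resizing, denoising, or deskewing before the plate is passed to Tesseract.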

Mr.Noob