Here is my iOS OCR code for recognizing numbers with the Tesseract engine:

Tesseract* tesseract = [[Tesseract alloc] initWithDataPath:@"tessdata" language:@"eng"];

// Restrict recognition to digits only
[tesseract setVariableValue:@"0123456789" forKey:@"tessedit_char_whitelist"];

// Page segmentation mode 7: treat the image as a single text line
[tesseract setVariableValue:@"7" forKey:@"tessedit_pageseg_mode"];

[tesseract setImage:argImage];
[tesseract recognize];
m_convertedText = [[tesseract recognizedText] copy];
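Since the recognized text may carry stray whitespace or newlines around the digits, a small clean-up step can make results easier to compare against the expected number. The following is only a minimal sketch in plain C (which also compiles inside an Objective-C .m file); the helper name `keep_digits` is hypothetical and is not part of Tesseract or its iOS wrapper.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: copies only the ASCII digits from `in` into `out`.
 * `out_size` is the capacity of `out`, including the terminating NUL.
 * Returns the number of digits written. */
size_t keep_digits(const char *in, char *out, size_t out_size)
{
    size_t n = 0;

    if (out_size == 0) {
        return 0; /* no room even for the terminator */
    }

    /* Stop at the end of the input or when only the NUL slot is left. */
    for (; *in != '\0' && n + 1 < out_size; ++in) {
        if (isdigit((unsigned char)*in)) {
            out[n++] = *in;
        }
    }

    out[n] = '\0';
    return n;
}
```

It could be called as `keep_digits([m_convertedText UTF8String], buf, sizeof buf)` with a caller-provided `char buf[...]`. This only tidies the output; it does not correct a misread digit such as 5 for 8.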

With the code above, some images are recognized correctly. However, sometimes I get 5 instead of 8, 6 instead of 5, and so on. My input images are very clean: pure black and white after binarization.

Are there any other Tesseract options that I should be setting? I see there are 600+ options and the documentation is very sparse.

The best I could find was this website, which lists all the options, but it is not very clear for an OCR beginner.

If someone has achieved 100% accuracy recognizing numbers with Tesseract, their settings would be really helpful.

Nirav Bhatt
  • What I found is that the result differs between the iOS simulator and the device, with the device being less accurate. Any clues, anyone? – Nirav Bhatt Sep 17 '13 at 16:30
  • You need to go through the training process; have you done that? http://www.resolveradiologic.com/blog/2013/01/15/training-tesseract/ – valentt Nov 01 '13 at 21:40
