I am trying to scan a business card using Tesseract OCR. All I am doing is sending the image in with no preprocessing. Here's the code I am using:
Tesseract* tesseract = [[Tesseract alloc] initWithLanguage:@"eng+ita"];
tesseract.delegate = self;
[tesseract setVariableValue:@"0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ@.-()" forKey:@"tessedit_char_whitelist"];
[tesseract setImage:[UIImage imageNamed:@"card.jpg"]]; //image to check
[tesseract recognize];
NSLog(@"Here is the text %@", [tesseract recognizedText]);
As you can see, the accuracy is not 100%, but that is not what I am concerned about; I figure I can fix that with some simple pre-processing. However, notice that it mixes the two text blocks at the bottom, which splits up the address, and it could split up other information on other cards.
How can I use Leptonica (or something else, maybe OpenCV) to group the text? Could I send regions of text in the image to Tesseract individually? I've been stuck on this problem for a while, so any possible solutions are welcome!
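To make the "regions" idea concrete, here is a rough sketch of the grouping step I have in mind. It assumes I can already get one bounding box per word or line from some layout step (Leptonica, OpenCV, or anything else); that part is not shown. The `Box` type, `group_boxes`, `block_bounds`, and the `gap` threshold are hypothetical names of my own, not part of Tesseract, Leptonica, or OpenCV. It is plain C, so it should compile inside an Objective-C file as well.

```c
/* Axis-aligned bounding box of one word or line, in image pixels.
   Hypothetical type for illustration; not from any OCR library. */
typedef struct { int x, y, w, h; } Box;

/* True if the horizontal and vertical distance between a and b
   are both at most `gap` pixels (overlapping boxes count as near). */
static int boxes_near(Box a, Box b, int gap) {
    return a.x <= b.x + b.w + gap && b.x <= a.x + a.w + gap &&
           a.y <= b.y + b.h + gap && b.y <= a.y + a.h + gap;
}

/* Union-find lookup with path halving. */
static int find_root(int *parent, int i) {
    while (parent[i] != i) {
        parent[i] = parent[parent[i]];
        i = parent[i];
    }
    return i;
}

/* Groups boxes into blocks: two boxes share a block if a chain of
   "near" boxes connects them. On return, label[i] is the smallest
   box index in box i's block. Returns the number of blocks. */
int group_boxes(const Box *boxes, int n, int gap, int *label) {
    int i, j, count = 0;
    for (i = 0; i < n; i++) label[i] = i;
    for (i = 0; i < n; i++) {
        for (j = i + 1; j < n; j++) {
            if (boxes_near(boxes[i], boxes[j], gap)) {
                int ri = find_root(label, i);
                int rj = find_root(label, j);
                /* Keep the smaller index as the root so labels are stable. */
                if (ri < rj) label[rj] = ri;
                else if (rj < ri) label[ri] = rj;
            }
        }
    }
    for (i = 0; i < n; i++) {
        label[i] = find_root(label, i);
        if (label[i] == i) count++;
    }
    return count;
}

/* Bounding rectangle of every box labelled `block`.
   Assumes at least one box carries that label. */
Box block_bounds(const Box *boxes, const int *label, int n, int block) {
    int i, first = 1, x0 = 0, y0 = 0, x1 = 0, y1 = 0;
    Box out;
    for (i = 0; i < n; i++) {
        if (label[i] != block) continue;
        if (first || boxes[i].x < x0) x0 = boxes[i].x;
        if (first || boxes[i].y < y0) y0 = boxes[i].y;
        if (first || boxes[i].x + boxes[i].w > x1) x1 = boxes[i].x + boxes[i].w;
        if (first || boxes[i].y + boxes[i].h > y1) y1 = boxes[i].y + boxes[i].h;
        first = 0;
    }
    out.x = x0; out.y = y0; out.w = x1 - x0; out.h = y1 - y0;
    return out;
}
```

The thought is that each rectangle returned by `block_bounds` could be cropped out of the card image and passed to Tesseract on its own, so the two blocks at the bottom would no longer be interleaved. I don't know whether a fixed `gap` is robust enough across different cards, which is part of what I am asking.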