
I would like to use Tesseract to recognize only digits, so I set it up as follows:

  tesseract::TessBaseAPI tess;
  tess.SetVariable("tessedit_char_whitelist","0123456789");
  tess.Init(tessdata, "eng", tesseract::OEM_DEFAULT);
  tess.SetImage((uchar*)im.data, im.size().width, im.size().height, im.channels(), im.step1());
  const char* out = tess.GetUTF8Text();

But letters still appear in the result. I'm new to Tesseract; could anyone help me figure out the problem? Thank you.

By the way, the image is a little rotated.

ysfseu
  • This might be useful: http://stackoverflow.com/questions/4944830/how-to-make-tesseract-to-recognize-only-numbers-when-they-are-mixed-with-letter – Kumar Saurabh Nov 19 '15 at 16:21
  • @KumarSaurabh Does this mean that characters with low resemblance to a digit will still be recognized as letters, even though only digits are whitelisted? I think it would be more reasonable for them to be recognized as wrong digits instead. – ysfseu Nov 28 '15 at 23:27

1 Answer


I'm not sure if you've figured it out already, but your problem might be that you are calling SetVariable() before Init(). Looking at the comment on SetVariable() in baseapi.h, the wording is kinda self-contradictory:

SetVariable may be used before Init, but settings will revert to defaults on End(). Note: Must be called after Init(). Only works for non-init variables (init variables should be passed to Init()).

So I'd suggest you give a simple reordering a go and see how that works out.

Disclaimer: I haven't tested, so I don't know if that's the issue.

thomhell