I am developing an Android application that recharges phone credit from a picture of the card, taken with the phone's camera or picked from the gallery. I use the Tesseract library for this, with a blacklist and a whitelist so that only digits are recognized, but it does not work as expected.

The picture I used contains only these two lines:

PIN code

41722757649786

The result before starting the recharge activity was:

718 200

41722757649786

I want to recognize only the digits, without letters and without using a cropper.

    public void initTess() {
        // Release any previous engine instance before creating a new one.
        if (mBaseApi != null)
            mBaseApi.end();

        mBaseApi = new TessBaseAPI();
        mBaseApi.setDebug(false);

        mBaseApi.setPageSegMode(TessBaseAPI.PageSegMode.PSM_OSD_ONLY);
        mBaseApi.init(mDataDir + File.separator, "eng");
        mBaseApi.setVariable(TessBaseAPI.VAR_CHAR_WHITELIST, "0123456789");
        mBaseApi.setVariable(TessBaseAPI.VAR_CHAR_BLACKLIST, "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz");
    }
taiba
  • Thanks for your quick response. Yes, the problem is that the text "PIN code" is recognized as "718 200". I want the letters not to show up at all. Is that possible? – taiba Nov 06 '14 at 15:43

1 Answer

Setting the "tessedit_char_whitelist" variable must be done BEFORE the init, as stated in the FAQ: https://code.google.com/p/tesseract-ocr/wiki/FAQ#How_do_I_recognize_only_digits? This most likely holds true for the blacklist as well.

Therefore, changing your code from this:

    mBaseApi.setPageSegMode(TessBaseAPI.PageSegMode.PSM_OSD_ONLY);
    mBaseApi.init(mDataDir + File.separator, "eng");
    mBaseApi.setVariable(TessBaseAPI.VAR_CHAR_WHITELIST, "0123456789");
    mBaseApi.setVariable(TessBaseAPI.VAR_CHAR_BLACKLIST, "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz");

to this:

    mBaseApi.setPageSegMode(TessBaseAPI.PageSegMode.PSM_OSD_ONLY);
    mBaseApi.setVariable(TessBaseAPI.VAR_CHAR_WHITELIST, "0123456789");
    mBaseApi.setVariable(TessBaseAPI.VAR_CHAR_BLACKLIST, "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz");
    mBaseApi.init(mDataDir + File.separator, "eng");

should do the trick.
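
If stray digits such as "718 200" still show up for the "PIN code" label after this change, one possible fallback is to filter the recognized text after OCR instead of relying on the engine alone. The following is only a rough sketch in plain Java and is not part of Tesseract or tess-two: the class name `PinExtractor`, the method `extractPin`, and the minimum length of 10 digits are all assumptions made for illustration, as is the idea that the PIN is the longest unbroken run of digits in the result.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of tess-two: picks the PIN out of raw OCR text.
public class PinExtractor {

    // Matches one unbroken run of ASCII digits.
    private static final Pattern DIGIT_RUN = Pattern.compile("\\d+");

    /**
     * Returns the longest run of digits that has at least minDigits characters,
     * or null if the text contains no such run.
     * Assumption: the PIN is printed without spaces and is longer than any
     * stray digits produced by misread labels.
     */
    public static String extractPin(String ocrText, int minDigits) {
        String best = null;
        Matcher m = DIGIT_RUN.matcher(ocrText);
        while (m.find()) {
            String run = m.group();
            if (run.length() >= minDigits
                    && (best == null || run.length() > best.length())) {
                best = run;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // The OCR result quoted in the question.
        String ocrText = "718 200\n\n41722757649786";
        System.out.println(extractPin(ocrText, 10)); // prints 41722757649786
    }
}
```

In the app this would be called on the string returned by the OCR step, before starting the recharge activity. The threshold should be adjusted to the PIN length the cards actually use.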

Kaz