How to train tesseract-ocr on the same digits in three different handwritings?

Question

I wrote a Java program that reads characters from a scanned TIFF image, but the accuracy is not very good.

If the handwriting in the document changes, the result changes as well. Is there a way to train tesseract-ocr to handle this?

I also tried jTessBoxEditor, but it did not help.

score 0 · Answer 1 · answered Dec 20 '17 at 13:15

I suggest you dig into this post: http://www.tuxrincon.com/blog/training-tesseract-ocr/

Get sample pictures of each handwriting. Use "QT Box Editor" on several pictures to associate boxes with characters. Then pass the pictures and box files to tesseract for training with the "train.sh" script (you may need to correct a few mistakes in it). I did not use "train2.sh", because it seemed to be counterproductive in my case. Add the resulting traineddata files for all the handwritings to Tesseract's data directory (tessdata). You can also change the "QT Box Editor" configuration so that each handwriting is trained as a separate language.
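Once each handwriting has its own traineddata file, the tesseract command line can load several of them at once by joining their names with "+" after the -l option. Here is a minimal Java sketch of building that command. The names hw1, hw2 and hw3 are hypothetical and stand for whatever you called your trained "languages". The sketch assumes the tesseract executable is on the PATH and that the traineddata files are already installed.

```java
import java.util.ArrayList;
import java.util.List;

class TesseractCommand {
    // Builds the argument list for: tesseract <image> <outputBase> -l lang1+lang2+...
    // The language names are assumptions; use the names of your own traineddata files.
    static List<String> build(String imagePath, String outputBase, List<String> langs) {
        if (langs.isEmpty()) {
            throw new IllegalArgumentException("at least one language is required");
        }
        List<String> cmd = new ArrayList<>();
        cmd.add("tesseract");
        cmd.add(imagePath);
        cmd.add(outputBase);              // tesseract writes <outputBase>.txt
        cmd.add("-l");
        cmd.add(String.join("+", langs)); // e.g. hw1+hw2+hw3
        return cmd;
    }
}
```

To run it, pass the list to ProcessBuilder, for example new ProcessBuilder(TesseractCommand.build("scan.tif", "result", java.util.Arrays.asList("hw1", "hw2", "hw3"))).inheritIO().start().waitFor(), and then read result.txt. Whether combining the three models gives better results than one model trained on all samples is something you would have to test on your own scans.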
