I want to run OCR on this image. The text has a predefined format: the first five characters are letters, the next four are digits, and the last one is a letter.

When I execute the following command:

$ tesseract in.png stdout

I get the output BDVPD474SQ, which breaks the format: the ninth character should be a digit, but S is returned.
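The mismatch can be checked mechanically. Below is a minimal sketch in Python (an illustration only, not part of my Tesseract setup) that tests a string against the format, assuming the letters are upper-case ASCII. The names `matches_format` and `first_violation` are hypothetical helpers made up for this example.

```python
import re

# Assumed format: five upper-case letters, four digits, one upper-case letter.
FORMAT = re.compile(r"[A-Z]{5}[0-9]{4}[A-Z]")

# Same format written per position: L = letter, D = digit.
EXPECTED = "LLLLLDDDDL"


def matches_format(text: str) -> bool:
    """Return True if the whole string follows the expected format."""
    return FORMAT.fullmatch(text.strip()) is not None


def first_violation(text: str) -> int:
    """Return the 0-based index of the first offending character, or -1."""
    for i, (ch, kind) in enumerate(zip(text, EXPECTED)):
        if kind == "L":
            ok = "A" <= ch <= "Z"
        else:
            ok = "0" <= ch <= "9"
        if not ok:
            return i
    # All compared characters fit; a length mismatch is still a violation.
    if len(text) != len(EXPECTED):
        return min(len(text), len(EXPECTED))
    return -1


if __name__ == "__main__":
    ocr_output = "BDVPD474SQ"
    print(matches_format(ocr_output))
    print(first_violation(ocr_output))
```

A check like this only detects a bad result after the fact; it does not make Tesseract itself respect the format, which is what I am actually after.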

So I tried the user-patterns feature. I created a config file named bazaar in the directory /usr/share/tesseract-ocr/tessdata/configs with the following content:

load_system_dawg     F
load_freq_dawg       F
user_patterns_suffix user-patterns

I also created a file named eng.user-patterns in the directory /usr/share/tesseract-ocr/tessdata with the following content:

\A\A\A\A\A\d\d\d\d\A

Still, I get the same result:

$ tesseract in.png stdout bazaar
BDVPD474SQ

What am I doing wrong? Has anyone accomplished this with Tess4J?

Jo Oko
Bhushan
  • I had to delete my post, since it was obviously wrong. I looked at the source ( https://code.google.com/p/tesseract-ocr/source/browse/dict/trie.h ), which proves your pattern correct. Also I tried your example and got the same result. – Jo Oko Nov 05 '15 at 09:22
  • @JoOko So can we say that, this feature is not implemented ? – Bhushan Nov 05 '15 at 10:15
  • And still seems to be the case all these years later? :\ – jtlz2 Sep 12 '19 at 07:14
  • still facing same issue, there isn't much on how to use it – Muhammad Uzair Oct 08 '22 at 16:56
