Questions tagged [tesstrain]

6 questions
2
votes
1 answer

Tesseract tesstrain.sh cannot find the font

I am trying to train Tesseract by following the guide at https://tesseract-ocr.github.io/tessdoc/TrainingTesseract-4.00.html. Even though I did not pass a font argument on the command line, I get this error: Could not find font named 'Arial Bold'. Pango…
nour
2
votes
0 answers

Tesseract - Training vertical languages

I can't find any documentation on vert traineddata files. What do they contain? Are they the same language model, only with a different configuration? Should I just be training the basic model with vertical loaded as a sublanguage, and that would train…
L14n
2
votes
0 answers

Tesseract tesstrain.sh - Error: jpn_vert is not a valid language code

I am trying to run tesstrain.sh for jpn_vert: tesstrain.sh --fonts_dir ./tesstutorial --lang jpn_vert --linedata_only --save_box_tiff --langdata_dir ./tesstutorial --fontlist 'Font' --tessdata_dir ./tesstutorial --output_dir ./result and I'm getting…
L14n
2
votes
0 answers

Tesseract train specific characters

Is it possible to increase the accuracy for specific characters in an existing traineddata model? For example: the number 3 is often detected as 5, the number 5 is often detected as 8, and W is often detected as V. Does it make sense to…
Sean Stayns
0
votes
1 answer

Tesseract 4: Why isn't my training data compiling?

I am trying to train Tesseract 4 to recognise some electronic circuit diagram symbols, such as a resistor, capacitor, etc., from images, but there seems to be no straightforward guide to training Tesseract, and the official documentation seems to focus…
-1
votes
1 answer

Evaluation of a Tesseract 4 LSTM model trained on generated images against real images

I have trained a Tesseract 4 LSTM model against a set of ~30,000 ground truth images that I generated (as opposed to using "real" images from scanned works, of which I do not have enough to reliably train a model). The model works well (or at least…
Inductiveload