
I'm trying to create a traineddata file to train Tesseract to read the images I will feed it, but I don't understand what to include in the font_properties step. I'm following this example and the answer to this post. Both examples use only 0 and 1 as values in font_properties, and my traineddata file is for specific alphanumeric values. Could you tell me more about what to include in step 3 of the second link? Can it be anything, like a plain description of the font, or does it actually matter and need to be accurate?

Clint Theron

2 Answers


Each line of the font_properties file is formatted as follows:

fontname italic bold fixed serif fraktur

where fontname is a string naming the font (no spaces allowed!), and italic, bold, fixed, serif and fraktur are simple flags: 1 if the font has the named property, 0 if it does not.

Example:

timesitalic 1 0 0 1 0

https://tesseract-ocr.github.io/tessdoc/tess3/Training-Tesseract-3.03%E2%80%933.05.html#set_unicharset_properties
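
If you would rather generate these lines than type them by hand, here is a minimal sketch in Python. It only follows the format described above; the helper names `font_properties_line` and `parse_font_properties_line` are made up for illustration and are not part of Tesseract.

```python
# Order of the five flags, as given in the format description above.
FLAG_ORDER = ("italic", "bold", "fixed", "serif", "fraktur")


def font_properties_line(fontname, **flags):
    """Build one font_properties line (hypothetical helper, not a Tesseract API)."""
    # The font name must be a single token: no spaces allowed.
    if not fontname or any(ch.isspace() for ch in fontname):
        raise ValueError("fontname must be non-empty and contain no whitespace")
    unknown = set(flags) - set(FLAG_ORDER)
    if unknown:
        raise ValueError(f"unknown properties: {sorted(unknown)}")
    # Any flag that is not passed defaults to 0 (the font lacks that property).
    values = ["1" if flags.get(name, False) else "0" for name in FLAG_ORDER]
    return " ".join([fontname] + values)


def parse_font_properties_line(line):
    """Split one line back into (fontname, {property: bool})."""
    parts = line.split()
    if len(parts) != len(FLAG_ORDER) + 1:
        raise ValueError("expected a font name followed by five 0/1 flags")
    fontname, *values = parts
    if any(v not in ("0", "1") for v in values):
        raise ValueError("flags must be 0 or 1")
    return fontname, {name: v == "1" for name, v in zip(FLAG_ORDER, values)}


# Reproduces the example line above: an italic serif font.
print(font_properties_line("timesitalic", italic=True, serif=True))
# prints: timesitalic 1 0 0 1 0
```

A bold, non-italic font would be `font_properties_line("myfont", bold=True)`, where `myfont` is a placeholder for whatever name you gave your font.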

nguyenq

Oh, I get it now: 1 means yes and 0 means no. I was thinking about it differently. I understand now that if, for instance, the font is bold, I would give bold a value of 1.

Clint Theron