I'm using Tesseract 4 on Ubuntu 16.04. When I use the hOCR output feature with font info enabled in the hocr config file (hocr_font_info 1), I still don't get any "x_font" info in the output.

Is there any other way to enable font info in Tesseract 4?

hamma
  • Stack Overflow is a site for programming and development questions. This question appears to be off-topic because it is not about programming or development. See [What topics can I ask about here](http://stackoverflow.com/help/on-topic) in the Help Center. Perhaps [Super User](http://superuser.com/) or [Unix & Linux Stack Exchange](http://unix.stackexchange.com/) would be a better place to ask. Also see [Where do I post questions about Dev Ops?](http://meta.stackexchange.com/q/134306) – jww Jun 15 '17 at 17:36
  • Do you use the LSTM or original recognition method (--oem parameter)? The text font recognition feature will not work with the neural net LSTM engine, see also https://github.com/tesseract-ocr/tesseract/issues/684 . – zuphilip Jun 16 '17 at 08:21
  • I wasn't setting the --oem parameter, so I don't think that issue applies to me – hamma Jun 16 '17 at 08:34
  • I suppose it's too late to comment, but in case: I think zuphilip was suggesting setting the --oem parameter to 0, which causes tesseract v4 to use the same engine as in tesseract 3. That engine attempts to detect bold and italic, tagging them with 'strong' and 'em' respectively. The default v4 engine does not support this. – Mike Maxwell May 17 '21 at 22:49
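
To make the suggestion in the comments concrete, here is a minimal Python sketch, not a confirmed fix. It assumes the `tesseract` binary is on the PATH and that the installed traineddata still contains the legacy model (the LSTM-only data sets probably won't work with `--oem 0`). The helper names `build_command`, `run_ocr` and `extract_fonts`, and the file names in the usage note, are made up for illustration.

```python
import re
import subprocess

def build_command(image_path, output_base):
    """Build a tesseract call that forces the legacy engine (--oem 0)
    and turns on font info for the hOCR output."""
    return [
        "tesseract", image_path, output_base,
        "--oem", "0",              # legacy engine, as suggested in the comments
        "-c", "hocr_font_info=1",  # same variable the question sets in the config file
        "hocr",                    # hocr config file: produce hOCR output
    ]

def run_ocr(image_path, output_base):
    """Run tesseract; raises CalledProcessError if the call fails."""
    subprocess.run(build_command(image_path, output_base), check=True)

# x_font appears inside the title attribute of word spans, followed by the font name.
X_FONT = re.compile(r"x_font\s+([^;'\"]+)")

def extract_fonts(hocr_text):
    """Return every x_font value found in an hOCR string, in document order."""
    return [name.strip() for name in X_FONT.findall(hocr_text)]
```

For example, `run_ocr("page.png", "page")` should write `page.hocr`, and `extract_fonts(open("page.hocr", encoding="utf-8").read())` then lists the reported font names. An empty list would mean the output still carries no `x_font` entries.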

0 Answers