I am trying to do that in Python with Tesseract, but the recognition seems to depend on the chosen language to deduce the characters (which makes sense).

The text is a sequence of 14 characters, each of which can be any of the first 800 printable 2-byte UTF-8 characters. Even if the recognition (OCR) were limited to Latin-1 (or a smaller set of) characters, that would be something.

According to this question, Tesseract does not seem to need proper words, but the installer asks for a training set in a specific language.

P.S. To clarify: OCR (at least in an academic setting) takes advantage of context and of a dictionary to help resolve difficult letters.
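For illustration, here is a minimal sketch of the kind of call I have in mind, assuming the `pytesseract` wrapper and Pillow are installed. The helper names `build_tesseract_config` and `read_code` are hypothetical, and the default whitelist (ASCII letters and digits) is just a placeholder for the real character set. It relies on the Tesseract config variables `tessedit_char_whitelist`, `load_system_dawg` and `load_freq_dawg` to restrict the output alphabet and switch off the dictionary word lists. Whether the whitelist is honored may depend on the Tesseract version and engine mode, so treat this as something to try rather than a known solution.

```python
import string


def build_tesseract_config(whitelist, psm=7):
    """Build a config string that restricts output to `whitelist`
    and disables Tesseract's dictionary word lists.

    psm=7 asks Tesseract to treat the image as a single text line.
    """
    # pytesseract splits the config string shell-style, so whitespace,
    # quotes or backslashes inside the whitelist would break the arguments.
    if any(ch.isspace() or ch in "\"'\\" for ch in whitelist):
        raise ValueError("whitelist must not contain whitespace, quotes or backslashes")
    return (
        f"--psm {psm} "
        f"-c tessedit_char_whitelist={whitelist} "
        "-c load_system_dawg=0 "
        "-c load_freq_dawg=0"
    )


def read_code(image_path, whitelist=string.ascii_letters + string.digits, lang="eng"):
    """Run OCR on one image and return the recognized text (hypothetical helper)."""
    # Imported lazily so the config helper can be used without OCR dependencies.
    from PIL import Image
    import pytesseract

    with Image.open(image_path) as img:
        text = pytesseract.image_to_string(
            img, lang=lang, config=build_tesseract_config(whitelist)
        )
    return text.strip()
```

A language model (`lang`) still has to be supplied even with the dictionaries turned off, which is the part I am unsure about for non-word strings.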

dawid

0 Answers