
As I mentioned in my previous question, the problem I'm facing is that I have hundreds of images of handwritten notes. They were written by different people, but they are in sequence, so you know that, for example, person1 wrote img1.jpg -> img100.jpg. The handwriting style varies a lot from person to person, but some parts of the notes are always fixed (maybe that can help an algorithm).

I followed one user's suggestion to use tesseract, but it couldn't recognize any of the text. The text is not in English, but I did use the appropriate language data file.

My knowledge of AI is limited, but from searching and looking at some papers it looks like this could be done with a CNN. Can someone guide me as to what I should do from here? I'd like to go forward with the project, but I also don't have a lot of time to learn about neural networks. How challenging is it to implement one that solves this task?

1 Answer


I wouldn't use tesseract for handwriting recognition. You can train tesseract to recognize handwriting, but out of the box it is built for printed text, where it works well across many fonts and languages.

Here are two links on how to train it yourself:

I had better results with Amazon Rekognition: https://aws.amazon.com/en/recognition I would like to have an offline Java library for this, but I haven't found one yet. My next step will be to try the ABBYY services, because they can also focus on separated handwritten characters: https://abbyy.technology/en:features:ocr:icr
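To give an idea of what trying the Amazon service involves, here is a minimal sketch in Python using `boto3` and the Rekognition `DetectText` operation. It is not the code I used: the function names, the region, and the 80% confidence cut-off are assumptions chosen for illustration. It also assumes AWS credentials are already configured, and you should check whether the service supports the language of your notes.

```python
from __future__ import annotations

from typing import Any


def extract_lines(response: dict[str, Any], min_confidence: float = 80.0) -> list[str]:
    """Pull whole text lines out of a DetectText-style response.

    Rekognition returns both LINE and WORD detections; only LINE entries
    at or above the (assumed) confidence threshold are kept here.
    """
    lines = []
    for detection in response.get("TextDetections", []):
        is_line = detection.get("Type") == "LINE"
        confident = detection.get("Confidence", 0.0) >= min_confidence
        if is_line and confident:
            lines.append(detection["DetectedText"])
    return lines


def detect_text_in_file(path: str, region: str = "eu-west-1") -> list[str]:
    """Send one local image to Rekognition and return the detected lines.

    The region is a placeholder; credentials must be set up beforehand.
    """
    # Imported lazily so extract_lines can be used without the AWS SDK installed.
    import boto3

    client = boto3.client("rekognition", region_name=region)
    with open(path, "rb") as image_file:
        response = client.detect_text(Image={"Bytes": image_file.read()})
    return extract_lines(response)
```

Calling `detect_text_in_file("img1.jpg")` on each image in sequence would give one list of lines per page. Since each writer covers a known range of files, you could compare the results per person to see which handwriting styles the service copes with.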

Update

If somebody finds a library or a good service, even years later, I would be happy to see it in the comments.

timguy