I have hundreds of images of handwritten notes. They were written by different people, but they are in sequence, so you know that, for example, person1 wrote img1.jpg -> img100.jpg. The style of handwriting varies a lot from person to person, but some parts of the notes are always fixed, which I imagine could help an algorithm (it helps me!).

I tried tesseract and it failed pretty badly at recognizing the text. Since each person has about 100 images, is there an algorithm I can train by feeding it a small number of examples, say 5 or fewer, so that it learns that person's handwriting? Or would that not be enough data? From searching around, it looks like I need to implement a CNN (e.g. this paper).

My knowledge of AI is limited, though. Is this something I could still do using a library and some studying? If so, what should I do going forward?

  • Try looking into tesseract OCR: https://www.pyimagesearch.com/2017/07/10/using-tesseract-ocr-python/ – clubby789 Oct 15 '19 at 12:48
  • What you can do is take each image and extract each word from it (using a pre-trained CNN), then combine the words into a sentence. Otherwise, you can look at RNN/LSTM, but I would go with a CNN first. – CutePoison Oct 15 '19 at 12:50

2 Answers

This is called OCR (optical character recognition), and there has been a lot of progress on it. Here is an example of how simple it is to convert an image file to text using tesseract:

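try:
    from PIL import Image  # Pillow
except ImportError:
    import Image  # fallback for the legacy PIL package
import pytesseract


def ocr_core(filename):
    """Run Tesseract OCR on an image file and return the recognized text."""
    # Open the image with Pillow and hand it to Tesseract.
    text = pytesseract.image_to_string(Image.open(filename))
    return text


print(ocr_core('sample.png'))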

However, I am not sure that it can recognize different styles of handwriting. You can give it a try yourself to find out. If you want to try the Python example, you need to import pytesseract, but first you have to install tesseract on your OS and add it to your PATH.
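A quick way to confirm the installation step worked is to check whether the executable is visible on your PATH before calling pytesseract. This is only a minimal sketch: the helper name `tesseract_on_path` is made up for illustration, and it assumes the binary is called `tesseract`, which is the usual name but may differ on your system.

```python
import shutil


def tesseract_on_path(binary_name="tesseract"):
    """Return True if an executable with this name can be found on PATH.

    `binary_name` defaults to "tesseract" (an assumption; adjust if your
    install uses a different name).
    """
    return shutil.which(binary_name) is not None


if tesseract_on_path():
    print("tesseract found at:", shutil.which("tesseract"))
else:
    print("tesseract not found on PATH; install it or add its folder to PATH")
```

If the check fails even though tesseract is installed, pytesseract also lets you point at the executable directly by setting `pytesseract.pytesseract.tesseract_cmd` to its full path.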

Kostas Charitidis
  • I tried your code but it didn't return any text. As a sanity check I tried [something easier](https://imgur.com/a/QyIqrhh) and it [actually gave me something back](https://imgur.com/a/2jki7ao). –  Oct 15 '19 at 13:41

There are many OCR engines out there, and some perform better than others. However, this is a field that has improved a lot recently thanks to deep neural networks. I would consider using a cloud provider such as Azure, Google Cloud or Amazon. You upload the image and they return the recognized text along with metadata.

For instance: https://azure.microsoft.com/en-us/services/cognitive-services/computer-vision/

If you don't want to use cloud services for any reason, I would consider using TensorFlow... but some knowledge is required:

Tensorflow model for OCR

Alvaro Arranz
  • I will try `Tensorflow` then, as `tesseract` doesn't seem capable of handling these handwritten notes. –  Oct 15 '19 at 13:44