
I'm trying to make a program that can read the information off a nutritional label, but Tesseract has a lot of trouble reading anything at all. I've tried a number of different image processing techniques using OpenCV, but nothing seems to help much.

Here are some of my better looking attempts (which happen to be the simplest):

Tango bottle label unedited

Tango bottle label edited

Output:

200k], Saturates, 09

Irn Bru bottle label unedited

Irn Bru bottle label edited

Output

The processing here is just converting the images to grey scale, applying a 3x3 Gaussian blur, and then Otsu binarisation.

I would appreciate any help on how to make the text more readable using OpenCV or any other image processing library.

Would it be simpler to forgo Tesseract and use machine learning for this?

Masonator

1 Answer


First of all, read this StackOverflow answer regarding OCR preprocessing.

The most important steps described there are image binarization and image denoising.
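To make the binarization step a bit more concrete, here is a minimal plain-Java sketch of Otsu's method (the thresholding the question mentions), which picks the threshold that maximizes the between-class variance of a 256-bin grey-level histogram. The class and method names (`OtsuSketch`, `otsuThreshold`, `binarize`) are made up for illustration and are not part of OpenCV; in OpenCV itself you would normally get the same kind of result by passing `Imgproc.THRESH_OTSU` to `Imgproc.threshold`.

```java
import java.util.Arrays;

// Hypothetical illustration class, not an OpenCV API.
class OtsuSketch {

    // Returns the grey level t that maximizes the between-class variance,
    // treating pixels <= t as one class and pixels > t as the other.
    static int otsuThreshold(int[] histogram) {
        long total = 0, weightedSum = 0;
        for (int i = 0; i < 256; i++) {
            total += histogram[i];
            weightedSum += (long) i * histogram[i];
        }
        long countBelow = 0, sumBelow = 0;
        double bestVariance = -1.0;
        int bestThreshold = 0;
        for (int t = 0; t < 256; t++) {
            countBelow += histogram[t];
            sumBelow += (long) t * histogram[t];
            long countAbove = total - countBelow;
            if (countBelow == 0) continue; // no pixels in the lower class yet
            if (countAbove == 0) break;    // all pixels are in the lower class
            double meanBelow = (double) sumBelow / countBelow;
            double meanAbove = (double) (weightedSum - sumBelow) / countAbove;
            double diff = meanBelow - meanAbove;
            double variance = (double) countBelow * countAbove * diff * diff;
            if (variance > bestVariance) {
                bestVariance = variance;
                bestThreshold = t;
            }
        }
        return bestThreshold;
    }

    // Maps every pixel above the threshold to 255 and the rest to 0.
    static int[] binarize(int[] pixels, int threshold) {
        int[] out = new int[pixels.length];
        for (int i = 0; i < pixels.length; i++) {
            out[i] = pixels[i] > threshold ? 255 : 0;
        }
        return out;
    }

    public static void main(String[] args) {
        // Toy "image": dark text pixels around 50, bright background around 200.
        int[] pixels = {40, 45, 50, 55, 60, 190, 200, 210};
        int[] histogram = new int[256];
        for (int p : pixels) histogram[p]++;
        int t = otsuThreshold(histogram);
        System.out.println("threshold = " + t);
        System.out.println(Arrays.toString(binarize(pixels, t)));
    }
}
```

A global threshold like this tends to work when the lighting is fairly even; on curved or glossy bottle labels a locally adaptive threshold may be worth trying instead.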

Here is an example:

Original Image

Grey Scale

Unsharp Masking

Binarization

The image is now ready for OCR.

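Java code

// Working Mats for each stage ("original" is the already loaded input image)
Mat grey = new Mat(), blur = new Mat(), unsharp = new Mat(), binary = new Mat();

// 1. Grey scale (use COLOR_BGR2GRAY if the image was loaded with Imgcodecs.imread)
Imgproc.cvtColor(original, grey, Imgproc.COLOR_RGB2GRAY, 0);

// 2. Unsharp masking: unsharp = 1.5 * grey - 0.5 * blur
Imgproc.GaussianBlur(grey, blur, new Size(0, 0), 3);
Core.addWeighted(grey, 1.5, blur, -0.5, 0, unsharp);

// 3. Binarization with a fixed threshold of 127
Imgproc.threshold(unsharp, binary, 127, 255, Imgproc.THRESH_BINARY);

// Save as PNG; imwrite and pixRead both take a file path string, not a File
String ocrImage = "ocrImage.png";
Imgcodecs.imwrite(ocrImage, binary);

/*initialize OCR ...*/
lept.PIX image = pixRead(ocrImage);
api.SetImage(image);
// In the JavaCPP presets GetUTF8Text() returns a BytePointer, hence getString()
String ocrOutput = api.GetUTF8Text().getString();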
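In case the unsharp masking step is unclear, here is a rough plain-Java sketch of what it does per pixel: blur the image, then compute `(1 + amount) * grey - amount * blurred` and clamp the result to 0..255. With `amount = 0.5` this gives the weights 1.5 and -0.5. To keep it short it works on a single row of pixels and uses a 3-tap box blur instead of a Gaussian blur, so it is only an illustration of the idea; `UnsharpSketch` and its methods are hypothetical names, not OpenCV functions.

```java
import java.util.Arrays;

// Hypothetical illustration class, not an OpenCV API.
class UnsharpSketch {

    // 3-tap box blur on one row of pixels, replicating the edge pixels.
    // (A box blur stands in for the Gaussian blur here.)
    static double[] boxBlur(int[] row) {
        double[] out = new double[row.length];
        for (int i = 0; i < row.length; i++) {
            int left = row[Math.max(0, i - 1)];
            int right = row[Math.min(row.length - 1, i + 1)];
            out[i] = (left + row[i] + right) / 3.0;
        }
        return out;
    }

    // sharpened = (1 + amount) * grey - amount * blurred, rounded and clamped to 0..255.
    static int[] unsharp(int[] grey, double amount) {
        double[] blurred = boxBlur(grey);
        int[] out = new int[grey.length];
        for (int i = 0; i < grey.length; i++) {
            double v = (1.0 + amount) * grey[i] - amount * blurred[i];
            out[i] = (int) Math.max(0, Math.min(255, Math.round(v)));
        }
        return out;
    }

    public static void main(String[] args) {
        // A soft step from dark (10) to bright (200), like the edge of a printed character.
        int[] row = {10, 10, 10, 200, 200, 200};
        System.out.println(Arrays.toString(unsharp(row, 0.5)));
    }
}
```

Flat regions are left unchanged while the pixels on either side of an edge are pushed further apart, which is what makes a fixed threshold behave better on slightly blurry text.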
Marv
arxakoulini