Image preprocessing for text recognition

Question

What's the best set of image preprocessing operations to apply to images for text recognition in EmguCV?

I've included two sample images here.

Applying a low-pass or high-pass filter won't be suitable, as the text may be of any size. I've tried median and bilateral filters, but they don't seem to affect the image much.

The ideal result would be a binary image with all the text white, and most of the rest black. This image would then be sent to the OCR engine.

Thanks

could you please upload those sample images here? link gives 403 forbidden — Alupotha, Jan 29 '17 at 01:34

karlphillip · Accepted Answer · 2019-07-31T13:43:14.813

There is no single best set. Keep in mind that digital images can be acquired by different capture devices, and each device can embed its own preprocessing (filters) and other characteristics that can drastically change the image and even add noise to it. So every case has to be treated (preprocessed) differently.

However, there are common operations that can be used to improve detection. A very basic one is to convert the image to grayscale and apply a threshold to binarize it. Another technique I've used before is the bounding box, which allows you to detect the text region. To remove noise from images, you might be interested in erode/dilate operations. I demonstrate some of these operations in this post.
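
As a rough illustration of the grayscale, threshold, and bounding-box steps mentioned above, here is a minimal sketch in plain C++ that works on a tiny in-memory image instead of calling OpenCV. The names `Rgb`, `Box`, `toGray`, `binarize`, and `boundingBox` are hypothetical helpers made up for this example. In OpenCV the corresponding calls would be `cv::cvtColor`, `cv::threshold`, and `cv::boundingRect`. The BT.601 luma weights are an assumption about what "grayscale" means here.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical types for this sketch; not part of OpenCV or EmguCV.
struct Rgb { std::uint8_t r, g, b; };
struct Box { int x, y, width, height; };  // width == 0 means "no foreground"

using GrayImage = std::vector<std::vector<std::uint8_t>>;

// Convert one RGB pixel to gray using the BT.601 luma weights (rounded).
std::uint8_t toGray(Rgb p) {
    return static_cast<std::uint8_t>(0.299 * p.r + 0.587 * p.g + 0.114 * p.b + 0.5);
}

// Binary threshold: values strictly above `thres` become 255, the rest 0.
GrayImage binarize(const GrayImage& img, std::uint8_t thres) {
    GrayImage out = img;
    for (auto& row : out)
        for (auto& v : row)
            v = (v > thres) ? 255 : 0;
    return out;
}

// Smallest axis-aligned box that contains every white (255) pixel.
Box boundingBox(const GrayImage& bin) {
    int minX = -1, minY = -1, maxX = -1, maxY = -1;
    for (int y = 0; y < static_cast<int>(bin.size()); ++y) {
        for (int x = 0; x < static_cast<int>(bin[y].size()); ++x) {
            if (bin[y][x] != 255) continue;
            if (minX < 0 || x < minX) minX = x;
            if (x > maxX) maxX = x;
            if (minY < 0) minY = y;  // rows are visited top to bottom
            maxY = y;
        }
    }
    if (minX < 0) return {0, 0, 0, 0};
    return {minX, minY, maxX - minX + 1, maxY - minY + 1};
}
```

The box returned by `boundingBox` is the region you would crop before handing the image to the OCR engine. On a real photo, noise pixels would inflate that box, which is where the erode/dilate step comes in.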

Also, there are other interesting posts about OCR and OpenCV that you should take a look at:

Now, just to show you a simple approach that can be used with your sample image, this is the result of inverting the color and applying a threshold:

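// Inside main(int argc, char** argv), with #include <opencv2/opencv.hpp>.
// Load the image and invert its colors.
cv::Mat new_img = cv::imread(argv[1]);
cv::bitwise_not(new_img, new_img);

// Binarize: values above thres become color (255), everything else 0.
double thres = 100;
double color = 255;
cv::threshold(new_img, new_img, thres, color, cv::THRESH_BINARY);

cv::imwrite("inv_thres.png", new_img);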
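
To see what this does to individual pixels, it can help to trace single values. The sketch below is plain C++ with a hypothetical helper, `invertThenThreshold` (not an OpenCV function), that mirrors the effect of `cv::bitwise_not` followed by a binary threshold on one 8-bit channel value. It assumes the same `thres = 100` and `color = 255` as the snippet above.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: the effect of bitwise_not followed by a binary
// threshold on a single 8-bit channel value.
std::uint8_t invertThenThreshold(std::uint8_t value,
                                 std::uint8_t thres = 100,
                                 std::uint8_t color = 255) {
    // Flipping all 8 bits is the same as computing 255 - value.
    const std::uint8_t inverted = static_cast<std::uint8_t>(~value);
    // Binary threshold: strictly above thres maps to color, the rest to 0.
    return inverted > thres ? color : 0;
}
```

With these defaults, a source value of 154 or less maps to 255 and anything from 155 up maps to 0, so dark text on a light background comes out white on black. If the text in a given image is lighter than its background, the inversion step would likely have to be skipped.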

I guess I'll have to find the right set of erode/dilate operations for each image. Right now, I can't seem to find a combination that works reasonably well for all images. The watershed example works best, though. — Osiris, Jul 17 '12 at 03:51
The second OpenCV link is dead, here is an alternative: https://github.com/damiles/basicOCR — yurez, Dec 16 '15 at 14:05

score 2 · Answer 2 · answered Jul 13 '12 at 09:20

Try morphological image processing. Have a look at this. However, it works only on binary images, so you will have to binarize the image first (with a threshold, for example). Although it is simple, it depends on font size, so one structuring element will not work for all font sizes. If you want a generic solution, there are a number of papers on text detection in images; a search for this term on Google Scholar should turn up some useful publications.
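
The font-size dependence can be shown with a small sketch. This is plain C++ on a tiny binary grid, not OpenCV code: `Binary`, `morph`, `opening`, and `countForeground` are hypothetical names for this example, and the square structuring element is an assumption. In OpenCV the equivalent operations are `cv::erode`, `cv::dilate`, and `cv::morphologyEx` with `cv::MORPH_OPEN`.

```cpp
#include <cassert>
#include <vector>

using Binary = std::vector<std::vector<int>>;  // 1 = foreground, 0 = background

// Erode or dilate with a square k x k structuring element (k odd).
// Erosion keeps a pixel only if every in-image pixel under the element is
// foreground; dilation sets it if any of them is. Positions that fall
// outside the image are ignored.
Binary morph(const Binary& img, int k, bool erode) {
    assert(k > 0 && k % 2 == 1);
    const int h = static_cast<int>(img.size());
    const int w = h ? static_cast<int>(img[0].size()) : 0;
    const int r = k / 2;
    Binary out(h, std::vector<int>(w, 0));
    for (int y = 0; y < h; ++y) {
        for (int x = 0; x < w; ++x) {
            bool all = true, any = false;
            for (int dy = -r; dy <= r; ++dy) {
                for (int dx = -r; dx <= r; ++dx) {
                    const int yy = y + dy, xx = x + dx;
                    if (yy < 0 || yy >= h || xx < 0 || xx >= w) continue;
                    if (img[yy][xx]) any = true; else all = false;
                }
            }
            out[y][x] = erode ? all : any;
        }
    }
    return out;
}

// Opening = erosion followed by dilation with the same element.
Binary opening(const Binary& img, int k) {
    return morph(morph(img, k, true), k, false);
}

int countForeground(const Binary& img) {
    int n = 0;
    for (const auto& row : img)
        for (int v : row) n += v;
    return n;
}
```

On a 7 x 7 grid holding a 3 x 3 blob (standing in for a character stroke) and a one-pixel speck, `opening(img, 3)` removes the speck and keeps the blob, while `opening(img, 5)` erases the blob as well. An element sized for large text will wipe out small text, which is why a single structuring element does not suit all font sizes.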

go4sri

Thanks, that paper is going to be really useful. I looked at morphological operations, but, as you said, they are dependent on the text size. – Osiris Jul 17 '12 at 03:47
