
I am trying to enhance old hand-drawn maps that were digitized by scanning. The scanning process has caused some blacked-out areas in the images, making the text inside them very hard to read.

I tried adaptive histogram equalization and a couple of other histogram-based approaches in MATLAB, but nothing gives me the desired result. I can probably lighten the darker shades of grey and make the image look a bit better using adaptive histogram equalization, but it doesn't really help with the text.

Specifically, I tried adapthisteq(), a function available in MATLAB, with different parameter variations.

Something like this:

A = adapthisteq(I,'NumTiles',X,'ClipLimit',0.01,'Distribution','uniform');   % X is a 1-by-2 tile count, e.g. the default [8 8]

... and I also tried changing the pixel values directly after having a look at the image, something like this:

I(10 > I & I > 0) = 0;       % push the very dark greys down to black
I(30 > I & I > 10) = 10;     % keep the next band of greys dark
I(255 > I & I > 30) = 255;   % push everything lighter up to white

Can I enhance the image so that the end result is purely black and white, where the lines and text (basically all of the information) become black (0) and the shades of grey and the lighter regions become white (255, or 1 for a logical image)?

Is this even possible? If not, how close can I get, and what is the best approach for getting as close as possible to the desired result? Any help is appreciated.

Here's what the original image looks like:

Here's what the result looks like after I tried out my solution using adaptive histogram equalization:

    Can you add the code you already tried and highlight where it failed? That way we can actually try and help you instead of shouting suggestions out of the blue. – Adriaan Aug 20 '15 at 14:34
  • Like I said, I tried adapthisteq(), which is a function available in MATLAB, with different variations. Something like this: A = adapthisteq(I,'NumTiles',X,'clipLimit',0.01,'Distribution','uniform'); and I also tried to change the pixel values directly after having a look at the image, something like this: I(10 > I & I > 0) = 0; I(30 > I & I > 10) = 10; I(255 > I & I > 30) = 255;. I wanted to know if there's an approach to solving problems like these. – PythonLearner Aug 20 '15 at 14:48
  • Take the images, post links to the images, take your code, post example code of what you tried here. – Ander Biguri Aug 20 '15 at 15:09
  • There's a link at the bottom of the question above which says 'Here's the link to it' . – PythonLearner Aug 20 '15 at 16:12
  • @rayryeng : thanks for editing, hope somebody helps! – PythonLearner Aug 20 '15 at 18:21
  • @PythonLearner - I'm in the process of figuring it out. I almost have it... just need to tweak the results. – rayryeng Aug 20 '15 at 18:22
  • @PythonLearner - Have a look at my answer. – rayryeng Aug 20 '15 at 18:39

1 Answer

Sounds like a classic case for adaptive thresholding. In a general sense, adaptive thresholding works by taking a look at a local neighbourhood around each pixel, computing the mean intensity of that neighbourhood, and checking whether the pixel falls a certain percentage below that mean. If it does, we set the output pixel to black, and if not, we set it to white.
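
To make that concrete, here is a minimal sketch of a plain local-mean threshold in MATLAB (this is not the Bradley-Roth implementation linked below, and the filename and parameter values are only placeholders):

% Minimal sketch of mean-based adaptive thresholding.
I = im2double(imread('map.png'));               % placeholder filename; assumed to already be grayscale
s = 12;                                         % neighbourhood size in pixels (assumed value)
t = 25;                                         % percentage below the local mean (assumed value)
localMean = imfilter(I, fspecial('average', s), 'replicate');   % mean of each s-by-s neighbourhood
out = ones(size(I));                            % start with everything white
out(I < localMean * (1 - t/100)) = 0;           % pixels well below their local mean become black
imshow(out);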

One classic approach is to use the Bradley-Roth algorithm.

If you'd like to see an explanation of the algorithm, you can take a look at a previous answer that I wrote up about it:

Bradley Adaptive Thresholding -- Confused (questions)

However, if you want the gist of it: an integral image of the grayscale version of the image is computed first. The integral image is important because it allows you to calculate the sum of the pixels within any window in O(1) time. Computing the integral image itself takes a full pass over the image, but you only have to do that once. With the integral image, you scan s x s neighbourhoods of pixels and check whether the centre pixel is more than t% below the average intensity of that s x s window. If it is, the pixel is classified as part of the foreground (the lines and text) and set to black; otherwise it is classified as background and set to white. This is adaptive because the thresholding is done using local pixel neighbourhoods rather than a single global threshold.
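
As a small illustration of the integral-image trick in MATLAB (the filename and the window coordinates below are arbitrary and only for demonstration):

% Build the integral image with two cumulative-sum passes.
I = im2double(imread('map.png'));     % placeholder filename, assumed grayscale
intImg = cumsum(cumsum(I, 1), 2);     % intImg(r,c) = sum of I(1:r, 1:c)

% Sum of the window spanning rows r1..r2 and columns c1..c2 using only
% four lookups (assuming r1 > 1 and c1 > 1 to keep the example short).
r1 = 50; c1 = 50; r2 = 61; c2 = 61;   % an arbitrary 12 x 12 window
windowSum  = intImg(r2, c2) - intImg(r1-1, c2) - intImg(r2, c1-1) + intImg(r1-1, c1-1);
windowMean = windowSum / ((r2 - r1 + 1) * (c2 - c1 + 1));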

In this post: Extract a page from a uniform background in an image, there is MATLAB code I wrote that implements the Bradley-Roth algorithm, so you're more than welcome to use it.

However, for your image, the parameters that gave me some OK results were s = 12 and t = 25.
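
Assuming you have saved the function from that post somewhere on your MATLAB path, the call would look something like the following (bradleyRoth is only a stand-in for whatever you named that function, and map.png is a placeholder filename):

I = imread('map.png');                 % placeholder filename
out = bradleyRoth(I, 12, 25);          % hypothetical name and signature: image, s, t
imshow(out);
imwrite(out, 'map_thresholded.png');   % save the binarized result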

After running the algorithm, I get this image:


Be advised that it isn't perfect... but you can start to see some text that you couldn't see before. Specifically, at the bottom I can read Lemont Library - Built 1948..., which wasn't visible in the original image.


Play around with the code and the parameters, read up on the algorithm, and just try things out yourself.

Hope this helps!

  • @rayryeng - Thank you very much, I just read the post and it looks interesting. I will try out your code, work with different parameters, and let you know how it goes. Thanks for the heads up! – PythonLearner Aug 20 '15 at 18:49
  • @PythonLearner - No problem. If you're satisfied, please consider accepting my answer... that's only after you do your testing of course. Good luck! – rayryeng Aug 20 '15 at 18:50
  • @rayryeng - I tried different variations on this algorithm and the best I could get out of it looks just like the one you have posted. Also, there are about a million images like these that I have to enhance. The problem is I can't do them one by one, and if I were to run all of them at once I'd have to define default s and t values, and from my trials on a couple of images a different value of 's' gives a better result for each image. – PythonLearner Aug 21 '15 at 16:04
  • @rayryeng - Anyway, this method has probably been the closest I could get to my solution, but I cannot go ahead with it for my entire collection and be done with the enhancement. I wish there was an algorithm which could just go through each pixel and say 'Hey, these pixels are part of the text, so I'm not going to touch them and will just process the rest'. Again, thank you very much; I'm going to keep looking for a better solution, so please let me know if you have any suggestions or more input on this. I will accept this as it is close to the solution even though it's not perfect. – PythonLearner Aug 21 '15 at 16:05
  • @PythonLearner - Ah yes. I understand your dilemma. Thank you for the accept... well, one thing I could suggest is to have some sort of machine learning algorithm. Analyze patches of your image and classify whether there is text or just background. That's another problem and I dare say it's harder. Let me think about that and get back to you... until then, good luck! – rayryeng Aug 21 '15 at 16:10