
How can I threshold this blurry image to make the digits as clear as possible?

In a previous post, I tried adaptively thresholding a blurry image (left), which resulted in distorted and disconnected digits (right):

[image: blurry input (left) and adaptive-threshold result (right)]

Since then, I've tried using a morphological closing operation as described in this post to make the brightness of the image uniform:

[image: image after morphological closing]

If I adaptively threshold this image, I don't get significantly better results. However, because the brightness is approximately uniform, I can now use an ordinary threshold:

[image: result of global thresholding]

This is a lot better than before, but I have two problems:

  1. I had to manually choose the threshold value. Although the closing operation results in uniform brightness, the level of brightness might be different for other images.
  2. Different parts of the image would do better with slight variations in the threshold level. For instance, the 9 and 7 in the top left come out partially faded and should have a lower threshold, while some of the 6s have fused into 8s and should have a higher threshold.

I thought that going back to an adaptive threshold, but with a very large block size (1/9th of the image), would solve both problems. Instead, I end up with a weird "halo effect" where the centre of the image is a lot brighter, but the edges are about the same as in the normally-thresholded image:

[image: adaptive threshold with a very large block size, showing the halo effect]

Edit: remi suggested morphologically opening the thresholded image at the top right of this post. This doesn't work too well. Using elliptical kernels, only a 3x3 is small enough to avoid obliterating the image entirely, and even then there are significant breakages in the digits:

[image: result of morphologically opening the thresholded image]

Edit2: mmgp suggested using a Wiener filter to remove blur. I adapted this code for Wiener filtering in OpenCV to OpenCV4Android, but it makes the image even blurrier! Here's the image before (left) and after filtering with my code and a 5x5 kernel:

[image: before (left) and after (right) Wiener filtering with a 5x5 kernel]

Here is my adapted code, which filters in-place:

private void wiener(Mat input, int nRows, int nCols) { // I tried nRows=5 and nCols=5
    // Note: input should be a floating-point Mat (e.g. CV_32F);
    // squaring an 8-bit Mat with Core.multiply would saturate at 255.

    Mat localMean = new Mat(input.rows(), input.cols(), input.type());
    Mat temp = new Mat(input.rows(), input.cols(), input.type());
    Mat temp2 = new Mat(input.rows(), input.cols(), input.type());

    // Create the kernel for convolution: a constant matrix with nRows rows 
    // and nCols cols, normalized so that the sum of the pixels is 1.
    Mat kernel = new Mat(nRows, nCols, CvType.CV_32F, new Scalar(1.0 / (double) (nRows * nCols)));

    // Get the local mean of the input.  localMean = convolution(input, kernel)
    Imgproc.filter2D(input, localMean, -1, kernel, new Point(nCols/2, nRows/2), 0); 

    // Get the local variance of the input.  localVariance = convolution(input^2, kernel) - localMean^2 
    Core.multiply(input, input, temp);  // temp = input^2
    Imgproc.filter2D(temp, temp, -1, kernel, new Point(nCols/2, nRows/2), 0); // temp = convolution(input^2, kernel)
    Core.multiply(localMean, localMean, temp2); //temp2 = localMean^2
    Core.subtract(temp, temp2, temp); // temp = localVariance = convolution(input^2, kernel) - localMean^2  

    // Estimate the noise as mean(localVariance)
    Scalar noise = Core.mean(temp);

    // Compute the result.  result = localMean + max(0, localVariance - noise) / max(localVariance, noise) * (input - localMean)

    Core.max(temp, noise, temp2); // temp2 = max(localVariance, noise)

    Core.subtract(temp, noise, temp); // temp = localVariance - noise
    Core.max(temp, new Scalar(0), temp); // temp = max(0, localVariance - noise)

    Core.divide(temp, temp2, temp);  // temp = max(0, localVar-noise) / max(localVariance, noise)

    Core.subtract(input, localMean, input);  // input = input - localMean
    Core.multiply(temp, input, input); // input = max(0, localVariance - noise) / max(localVariance, noise) * (input - localMean)
    Core.add(input, localMean, input); // input = localMean + max(0, localVariance - noise) / max(localVariance, noise) * (input - localMean)
}
1''
  • Here is a different (if not weird) take on your problem: if you are in control of the font used, change it to something better, where "better" means making it harder for either a 6 or a 9 to turn into an 8. Maybe make it bolder, too. I guess at some point you will attempt to recognize these digits, and that is the reason for your question. – mmgp Dec 03 '12 at 02:59
  • Unfortunately, I'm going to be recognizing these images from Android users' cameras, "in the wild", so there's no control over font. Though that would be a useful solution otherwise. – 1'' Dec 03 '12 at 03:09
  • Taking the problem in another way: why are you trying to make the digits clearer? Is it to OCR them afterwards? Probably, you can get quite good results by training an OCR with those kind of digits, and using the OCR to detect them in the image. – remi Dec 05 '12 at 09:09
  • If you have a good sample of every possible digit the users are going to take a photo from, it might be sensible to directly train from the digits. The way I worded the previous phrase makes it unlikely to be the case. Maybe it could be achievable using one-class classifiers, as One-Class SVM, since you also lack a good representation of what you don't expect to be a digit. Now, it would much easier to train a classifier if you didn't have broken digits or mis-connected ones. After thinning them, the task is much easier and much more prone to give correct results. – mmgp Dec 05 '12 at 13:01
  • I've decided to give mmgp's answer the bounty because I think it's the most effective way to solve the problem. On the other hand, I've accepted remi's answer because it gave me the suggestion to use Otsu thresholding, which is what I'm actually going to use (although I'm going to use it on each 3x3 box individually). I may revisit the blind deconvolution idea later if I feel the extra image quality is necessary. Sauvola may work a bit better than my Otsu with segments but it's going to be a whole lot slower unless implemented in C++, and probably isn't worth the trouble then. – 1'' Dec 08 '12 at 22:29
  • Assuming you know the sudoku cells is a strong prerequisite! You should have stated it in your question. Sauvola is very fast in practice, no matter which language you use, with an implementation based on integral images to compute the mean/standard deviation of any block in constant time. Using OpenCV, it can be coded in a few lines. But Otsu on local portions of your image should be better, since it computes the threshold by itself. – remi Dec 10 '12 at 09:24
  • Otsu is a statistical method that calculates the best threshold the statistics allow, i.e., that doesn't automatically make it better in real tasks. The problem here is a different one, not correctly solved by binarization; you can only expect a decent result at best with this kind of method. Now, my answer involves the simplification (which can be removed) of knowing the sudoku cells, and this doesn't change the fact that binarization is still not quite the right thing to do. Deconvolution, or deblurring in certain texts, is the logical step for the problem described. – mmgp Dec 10 '12 at 13:50
  • You're probably right that even segmented Sauvola isn't as good because it doesn't pick its threshold. There are "adaptive Sauvola" algorithms out there which might work marginally better, but will be slower. You're probably right that Otsu is worse than blind deconvolution + Otsu, but it's possible to correct for some of the difference by training my SVM on the bad digits, which tend to have certain common characteristics (e.g. all bad 5s look like 6s with the side half chopped off). – 1'' Dec 10 '12 at 16:28
  • It might be possible to solve with robust descriptors and classifiers, yes. But that is a different problem nevertheless. – mmgp Dec 10 '12 at 16:59
  • You probably want local thresholding; there are some general approaches for that. Check the Niblack algorithm. See also https://stackoverflow.com/questions/9871084/niblack-thresholding and https://stackoverflow.com/a/9891678/461499. We successfully used this for document segmentation. – Rob Audenaerde Dec 07 '12 at 22:59

4 Answers


Some hints that you might try out:

  • Apply the morphological opening in your original thresholded image (the one which is noisy at the right of the first picture). You should get rid of most of the background noise and be able to reconnect the digits.

  • Use a different preprocessing of your original image instead of morphological closing, such as a median filter (which tends to blur edges) or bilateral filtering, which preserves edges better but is slower to compute.

  • As far as thresholding is concerned, you can use the CV_OTSU flag in cv::threshold to determine an optimal value for a global threshold. Local thresholding might still be better, but it should work better after bilateral or median filtering.

remi
  • Great idea with the CV_OTSU flag, it works great! Unfortunately, morphological opening doesn't work well on my thresholded image (see my edited post). Also, Astor has [tried median/bilateral filters](http://stackoverflow.com/a/13395241/1397061) before thresholding, and it doesn't work as well as closing + normal threshold. – 1'' Dec 01 '12 at 00:55
  • Probably opening followed by closing will improve the results compared to opening alone. – remi Dec 02 '12 at 09:38

I've tried thresholding each 3x3 box separately, using Otsu's algorithm (CV_OTSU - thanks remi!) to determine an optimal threshold value for each box. This works a bit better than thresholding the entire image, and is probably a bit more robust.

[image: result of per-box Otsu thresholding]

Better solutions are welcome, though.
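For the curious, the threshold selection that CV_OTSU performs on each box's histogram can be sketched in plain Java, independent of OpenCV. This is a minimal sketch; the `OtsuDemo`/`otsuThreshold` names are mine, not OpenCV's:

```java
// Minimal sketch of Otsu's threshold selection on a 256-bin histogram.
// In practice you'd build one histogram per sudoku box and binarize each
// box with its own threshold, as described above.
public class OtsuDemo {

    // Returns the threshold t maximizing between-class variance;
    // pixels with value > t are treated as foreground.
    public static int otsuThreshold(int[] hist) {
        int total = 0;
        long sum = 0;
        for (int i = 0; i < 256; i++) {
            total += hist[i];
            sum += (long) i * hist[i];
        }
        long sumB = 0;        // intensity sum of the background class
        int wB = 0;           // background pixel count
        double bestVar = -1;
        int best = 0;
        for (int t = 0; t < 256; t++) {
            wB += hist[t];
            if (wB == 0) continue;
            int wF = total - wB;
            if (wF == 0) break;
            sumB += (long) t * hist[t];
            double meanB = (double) sumB / wB;
            double meanF = (double) (sum - sumB) / wF;
            double betweenVar = (double) wB * wF * (meanB - meanF) * (meanB - meanF);
            if (betweenVar > bestVar) {
                bestVar = betweenVar;
                best = t;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Toy bimodal histogram: dark ink around 10, bright paper around 200.
        int[] hist = new int[256];
        hist[10] = 100;
        hist[200] = 100;
        System.out.println(otsuThreshold(hist)); // picks a cut between the modes
    }
}
```

In OpenCV itself this is just `Imgproc.threshold` with the Otsu flag applied to the submatrix of each box.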

1''
  • An efficient binarization for this kind of document is the Sauvola technique (google sauvola + binarization). It is not implemented in OpenCV, but it is quite easy to do so, and you can use integral images to [compute the mean and standard deviation of image patches extremely fast](http://stackoverflow.com/questions/13110733/computing-image-integral/13113234#13113234). – remi Dec 02 '12 at 09:53
  • I tried Sauvola on your image, and I managed to get quite decent results, but indeed, as mmgp said, with fine tuning of the parameters. And probably that set of parameters will work only for this image and not be optimal for an image with different conditions. – remi Dec 05 '12 at 09:07
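To make the integral-image suggestion in the comments concrete, here is a hedged plain-Java sketch of Sauvola binarization (threshold T = m·(1 + k·(s/R − 1)), where m and s are the local mean and standard deviation; k ≈ 0.5 and R = 128 are common defaults). The `SauvolaDemo` class and parameter choices are mine, not from the comments:

```java
// Sketch of Sauvola local thresholding with integral images, so the mean and
// standard deviation of any window cost O(1) regardless of window size.
public class SauvolaDemo {

    // img: grayscale values 0..255; win: odd window size; k, R: Sauvola params.
    public static int[][] sauvola(int[][] img, int win, double k, double R) {
        int h = img.length, w = img[0].length;
        // (h+1) x (w+1) integral images of values and of squared values.
        long[][] integ = new long[h + 1][w + 1];
        long[][] integSq = new long[h + 1][w + 1];
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                long v = img[y][x];
                integ[y + 1][x + 1] = v + integ[y][x + 1] + integ[y + 1][x] - integ[y][x];
                integSq[y + 1][x + 1] = v * v + integSq[y][x + 1] + integSq[y + 1][x] - integSq[y][x];
            }
        }
        int r = win / 2;
        int[][] out = new int[h][w];
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                // Clamp the window to the image borders.
                int y0 = Math.max(0, y - r), y1 = Math.min(h, y + r + 1);
                int x0 = Math.max(0, x - r), x1 = Math.min(w, x + r + 1);
                long area = (long) (y1 - y0) * (x1 - x0);
                long s1 = integ[y1][x1] - integ[y0][x1] - integ[y1][x0] + integ[y0][x0];
                long s2 = integSq[y1][x1] - integSq[y0][x1] - integSq[y1][x0] + integSq[y0][x0];
                double mean = (double) s1 / area;
                double var = (double) s2 / area - mean * mean;
                double std = Math.sqrt(Math.max(0, var));
                double t = mean * (1 + k * (std / R - 1)); // Sauvola's formula
                out[y][x] = img[y][x] > t ? 255 : 0;
            }
        }
        return out;
    }
}
```

The two integral images make the window statistics constant-time, which is why Sauvola stays fast even with large windows.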

If you're willing to spend some cycles on it, there are deblurring techniques that could be used to sharpen up the picture prior to processing. There's nothing in OpenCV yet, but if this is a make-or-break kind of thing, you could add it.

There's a bunch of literature on the subject: http://www.cse.cuhk.edu.hk/~leojia/projects/motion_deblurring/index.html http://www.google.com/search?q=motion+deblurring

And some chatter on the OpenCV mailing list: http://tech.groups.yahoo.com/group/OpenCV/message/20938

The weird "halo effect" you're seeing is likely due to OpenCV assuming black when the adaptive-threshold window is at or near the edge of the image and "hangs over" into non-image territory. There are ways to correct for this. Most likely you would make a temporary image that's at least two full block-sizes taller and wider than the camera image, copy the camera image into the middle of it, and set the surrounding "blank" portion of the temp image to the average color of the camera image. When you then perform the adaptive threshold, the data at or near the edges will be much closer to accurate. It won't be perfect, since it's not a real picture, but it will yield better results than the black that OpenCV is assuming is there.
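That padding step can be sketched in plain Java as follows (the `PadDemo`/`padWithMean` names are mine; in OpenCV you could build the same thing with submat/copy, or use copyMakeBorder for replicate-style padding instead of mean-fill):

```java
// Sketch of the border trick: embed the image in a larger canvas filled with
// the image's mean intensity, so an adaptive-threshold window that overhangs
// the edge sees plausible data instead of black.
public class PadDemo {

    public static int[][] padWithMean(int[][] img, int border) {
        int h = img.length, w = img[0].length;
        long sum = 0;
        for (int[] row : img)
            for (int v : row)
                sum += v;
        int mean = (int) (sum / ((long) h * w));
        int[][] out = new int[h + 2 * border][w + 2 * border];
        for (int[] row : out)
            java.util.Arrays.fill(row, mean);   // fill the canvas with the mean
        for (int y = 0; y < h; y++)             // copy the image into the centre
            System.arraycopy(img[y], 0, out[y + border], border, w);
        return out;
    }
}
```

After adaptively thresholding the padded image, you'd crop the central h×w region back out.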

Mike Sandford

My proposal assumes you can identify the sudoku cells, which, I think, is not asking too much. Trying to apply morphological operators (although I really like them) and/or binarization methods as a first step is the wrong way here, in my opinion of course. Your image is at least partially blurry, for whatever reason (camera angle and/or movement, among other reasons). So what you need is to revert that by performing a deconvolution. Of course, asking for a perfect deconvolution is too much, but we can try some things.

One of these "things" is the Wiener filter; in Matlab, for instance, the function is named deconvwnr. I noticed the blur to be in the vertical direction, so we can perform a deconvolution with a vertical kernel of a certain length (10 in the following example) and also assume the input is not noise-free (an assumption of 5%) -- I'm just trying to give a very superficial view here, take it easy. In Matlab, your problem is at least partially solved by doing:

f = imread('some_sudoku_cell.png');
g = deconvwnr(f, fspecial('motion', 10, 90), 0.05);
h = im2bw(g, graythresh(g)); % graythresh is the Otsu method

Here are the results from some of your cells (original, Otsu, Otsu of region growing, morphologically enhanced image, Otsu of the enhanced image with region growing, Otsu of deconvolution):

[images: six rows of sudoku cells, one column per method listed above]

The enhanced image was produced by performing original + tophat(original) - bottomhat(original) with a flat disk of radius 3. I manually picked the seed point for region growing and manually picked the best threshold.
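That enhancement step can be sketched in plain Java. Note this is my own illustration: for brevity it uses a 3x3 square structuring element rather than the flat disk of radius 3 described above, and the `MorphEnhance` names are made up:

```java
// Sketch of morphological contrast enhancement:
//   enhanced = original + tophat(original) - bottomhat(original),
// where tophat = original - opening and bottomhat = closing - original.
public class MorphEnhance {

    // 3x3 min filter (grayscale erosion), replicating values at the borders.
    static int[][] erode(int[][] img) { return filter(img, true); }

    // 3x3 max filter (grayscale dilation).
    static int[][] dilate(int[][] img) { return filter(img, false); }

    private static int[][] filter(int[][] img, boolean takeMin) {
        int h = img.length, w = img[0].length;
        int[][] out = new int[h][w];
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                int best = img[y][x];
                for (int dy = -1; dy <= 1; dy++) {
                    for (int dx = -1; dx <= 1; dx++) {
                        int yy = Math.min(h - 1, Math.max(0, y + dy));
                        int xx = Math.min(w - 1, Math.max(0, x + dx));
                        best = takeMin ? Math.min(best, img[yy][xx])
                                       : Math.max(best, img[yy][xx]);
                    }
                }
                out[y][x] = best;
            }
        }
        return out;
    }

    public static int[][] enhance(int[][] img) {
        int[][] opening = dilate(erode(img)); // opening = dilation of erosion
        int[][] closing = erode(dilate(img)); // closing = erosion of dilation
        int h = img.length, w = img[0].length;
        int[][] out = new int[h][w];
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                int tophat = img[y][x] - opening[y][x];     // bright details
                int bottomhat = closing[y][x] - img[y][x];  // dark details
                int v = img[y][x] + tophat - bottomhat;
                out[y][x] = Math.max(0, Math.min(255, v));  // clamp to 8-bit
            }
        }
        return out;
    }
}
```

On a flat region both hats are zero and the image is unchanged; near edges the tophat boosts bright detail while the bottomhat darkens dark detail, which is what sharpens the digit strokes.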

For empty cells you get weird results (original and Otsu of deconvolution):

[images: empty cell — original and Otsu of deconvolution]

But I don't think you would have trouble detecting whether a cell is empty or not (the global threshold already solves it).

EDIT:

Added the best results I could get with a different approach: region growing. I also attempted some other approaches, but this was the second best one.

mmgp
  • This is a promising suggestion. By the way, for context, I'm trying to get this to work on Android with mobile phone pictures. Will this method improve image quality with blurring in an arbitrary orientation? Will it improve other types of image defects that you'd be likely to see in mobile phone pictures? – 1'' Dec 03 '12 at 00:49
  • I will restrict my answer to the blurring part, since other image defects are too broad a topic. What I showed here is a deconvolution with a known estimate of the point spread function (PSF), which is a vertical one. There are methods called blind deconvolution that will try to guess a proper PSF, likely better than the one I guessed. You should look into these if your interest is in a general approach to deblurring. – mmgp Dec 03 '12 at 01:07
  • I'm mainly looking for something that will consistently give good-quality digits. This may not be worth implementing (OpenCV does not have a built-in implementation) if it will just fix blur, if I can compensate for blur, lighting and other defects with a proper binarization algorithm. What do you think? – 1'' Dec 03 '12 at 01:11
  • I don't think you can compensate a blur with binarization algorithms, it is a different form of problem. I don't see lighting as a problem here if you already identified the sudoku board. Also, the deconvolution has good chances to provide stronger edges, making it easier for a simpler binarizer. Keep in mind that in Image Processing there is no such thing as "consistently give good-quality for X", where X is any problem being investigated. – mmgp Dec 03 '12 at 01:15
  • I googled for deconvolution in opencv and found something that might help you: https://github.com/Itseez/opencv/blob/master/samples/python2/deconvolution.py and https://www.ibm.com/developerworks/mydeveloperworks/blogs/theTechTrek/entry/experiments_with_deblurring_using_opencv1?lang=en (didn't read them, but sound helpful). – mmgp Dec 03 '12 at 01:19
  • I've tried Wiener filtering, and I edited my original question to include my code and results. How is my filtering different from yours? – 1'' Dec 08 '12 at 20:57
  • Hmm, apparently my code corresponds to the Matlab function `wiener2` rather than `deconvwnr`. – 1'' Dec 08 '12 at 21:00
  • I will expand the meaning of the second line in the code provided in my answer. First, I didn't use a 5x5 kernel. Instead, I used a vertical line (the value 90 indicates this) of height 10 and width 1. This kernel is actually very simple: [0.05 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.05]'. I also didn't apply it over the whole image. – mmgp Dec 08 '12 at 21:03
  • Ah, `wiener2` is a lowpass-filter so that is a much greater reason for the results you get. – mmgp Dec 08 '12 at 21:07
  • Oh, okay, that does look like a low-pass filter now that you mention it :) – 1'' Dec 08 '12 at 21:08
  • So, back to the drawing board. If this is going to be viable, I need some way to automatically approximate the point-spread function corresponding to the blurring in an image. Is it possible to do this effectively? – 1'' Dec 08 '12 at 21:12
  • It seems doable to convert `deconvwnr` (code for instance: http://pastebin.com/VuGgrmUj) to OpenCV provided there are good functions regarding FFT. – mmgp Dec 08 '12 at 21:14
  • For estimating PSF you need to use a blind deconvolution method. – mmgp Dec 08 '12 at 21:15
  • It looks like Matlab has a function called `deconvblind` which performs blind deconvolution. Have you heard of this function? – 1'' Dec 08 '12 at 21:36
  • I haven't used it personally, I'm not sure which of the many researcher papers it is based on. – mmgp Dec 08 '12 at 21:50
  • A simpler approach to deconvolution is to use a 'sharpen' kernel. This is very cheap and can get acceptable results. – Rob Audenaerde Dec 10 '12 at 13:41
  • The fourth column in the images included in the answer is a sharpening, using morphology, not sure if that was clear or not. – mmgp Dec 10 '12 at 13:55