
I need to know how to clean noise from an image with MATLAB.

Let's look at this example:

[image] [image]

As you can see, the numbers are not clearly visible.

So how can I remove the noise and the pixels that are not part of the numbers, so that identifying the digits will be easier?

Thanks.

Andrey Rubshtein
Ofir A.
  • Your picture is very noisy. You may have a great advantage if the font is always the same and known. If you have it, post an image of the numbers scaled to the same size as in the image you already posted, to see if that is enough to get a kickstart – Dr. belisarius Mar 15 '11 at 15:08
  • @belisarius thanks, but I didn't understand what you are saying here. Yes, the font is always the same, but how does that help me? – Ofir A. Mar 15 '11 at 21:48
  • Post an image with all the numbers in that font at the same scale as your image above – Dr. belisarius Mar 15 '11 at 23:29

4 Answers


Let's do it step by step in Mathematica:

(*first separate the image in HSB channels*)
i1 = ColorSeparate[ColorNegate@yourColorImage, "HSB"]


(*Let's keep the B Channel*)
i2 = i1[[3]]


(*And Binarize it *)
i3 = Binarize[i2, 0.92]


(*Perform a Thinning to get the skeleton*)
i4 = Thinning[i3]


(*Now we cut those hairs*)
i5 = Pruning[i4, 10]


(*Remove the small lines*)
i6 = DeleteSmallComponents[i5, 30]


(*And finally dilate*)
i7 = Dilation[i6, 3]


(*Now we can perform an OCR*)
TextRecognize@i7
-->"93 269 23"  

Done!

Dr. belisarius

Since this question is tagged MATLAB, I translated @belisarius's solution as such (which I think is superior to the currently accepted answer):

%# read image
I = imread('https://i.stack.imgur.com/nGNGf.png');

%# complement it, and convert to HSV colorspace
hsv = rgb2hsv(imcomplement(I));
I1 = hsv(:,:,3);                %# work with V channel

%# Binarize/threshold image
I2 = im2bw(I1, 0.92);

%# Perform morphological thinning to get the skeleton
I3 = bwmorph(I2, 'thin',Inf);

%# prune the skeleton (remove small branches at the endpoints)
I4 = bwmorph(I3, 'spur', 7);

%# Remove small components
I5 = bwareaopen(I4, 30);

%# dilate image
I6 = imdilate(I5, strel('square',2*3+1));

%# show step-by-step results
figure('Position',[200 150 700 700])
subplot(711), imshow(I)
subplot(712), imshow(I1)
subplot(713), imshow(I2)
subplot(714), imshow(I3)
subplot(715), imshow(I4)
subplot(716), imshow(I5)
subplot(717), imshow(I6)


Finally, you can apply some form of OCR to recognize the numbers. Unfortunately, MATLAB has no built-in function equivalent to TextRecognize[] in Mathematica... In the meantime, look in the File Exchange; I'm sure you will find dozens of submissions filling the gap :)
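For readers on a later MATLAB release, the Computer Vision Toolbox (a separate add-on, not base MATLAB) provides an ocr function that can fill this role. The sketch below is only an illustration: it runs on a synthetic image rendered with insertText rather than on the cleaned image from the steps above, and the variable names are made up for the example.

```matlab
% Requires the Computer Vision Toolbox (ocr, insertText).
% Synthetic stand-in for the cleaned image: black digits on white.
img = 255 * ones(80, 300, 3, 'uint8');
img = insertText(img, [10 10], '93 269 23', 'FontSize', 40, ...
                 'BoxOpacity', 0, 'TextColor', 'black');

% Restrict recognition to digits, since only numbers are expected.
result = ocr(img, 'CharacterSet', '0123456789');

% The recognized characters come back as text in the Text property.
txt = strtrim(result.Text);
disp(txt)
```

The pipeline above ends with white digits on a black background; passing imcomplement(I6) instead of I6 may give the recognizer an easier input, but that is worth checking on your own images.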

Amro
  • Your solution is great, but it doesn't work on the other cases, while misha's solution did. Thanks. – Ofir A. Jul 04 '11 at 21:28
  • @Michael: obviously, in image processing there is no one-size-fits-all solution. So you will want to tune the parameters (binary threshold, dilation size, etc.) to fit your specific cases – Amro Jul 04 '11 at 21:43
  • @Michael Also, and perhaps more important, you should provide a set of test cases when posting an image processing problem. That is the only way the rest of us can take care of your universe – Dr. belisarius Jul 05 '11 at 18:58

Did you start with a bilevel (two-color, black-and-white) image? Or did you threshold it yourself?

If it's the latter, you may find it easier to perform noise reduction before you threshold. In this case, please upload the image you have before thresholding.

If it's the former, then you'll have a tough time as far as traditional noise reduction is concerned. The reason is that many noise reduction approaches exploit the difference in statistical properties between the noise and the actual natural image. Thresholding essentially destroys that difference.
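To illustrate the point about filtering before thresholding, here is a minimal MATLAB sketch on a synthetic image. The gray levels, noise level, threshold, and 5x5 averaging filter are all assumptions picked for the demonstration, and imfilter/fspecial need the Image Processing Toolbox.

```matlab
% Synthetic two-level scene: dark left half, bright right half (assumed values).
rng(0);                                   % fixed seed so the run is repeatable
truth = [false(64, 32), true(64, 32)];    % ground truth: right half is "object"
clean = 100 + 50 * double(truth);         % gray levels 100 and 150
noisy = clean + 30 * randn(size(clean));  % additive Gaussian noise, sigma = 30

% (a) Threshold the noisy image directly.
bwDirect = noisy > 125;

% (b) Smooth first with a 5x5 averaging filter, then threshold.
smoothed   = imfilter(noisy, fspecial('average', 5), 'replicate');
bwSmoothed = smoothed > 125;

% Count misclassified pixels in each case.
errDirect   = nnz(bwDirect   ~= truth);
errSmoothed = nnz(bwSmoothed ~= truth);
fprintf('errors without filtering: %d, with filtering: %d\n', ...
        errDirect, errSmoothed);
```

Averaging works here because the noise varies from pixel to pixel while the underlying gray level does not. In an image that is already binary, that gray-level information is no longer available to the filter.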

EDIT

OK, technically, your image isn't really noisy -- it's blurry (letters are running into each other) and has background interference.

But anyway, here is how I would deal with it:

  • Pick a color channel to work with (RGB is three channels, typically one is enough). I chose green because it looked the easiest to manipulate.
  • Blur the image (I used a 5x5 Gaussian kernel in GIMP)
  • Threshold using an empirically determined threshold (basically, try each threshold until you get a decent result). It's OK if some of the numbers have gaps -- we can close them in the next step
  • Morphological image processing (erosion and dilation)
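The four steps above could be sketched in MATLAB roughly as follows (Image Processing Toolbox). This is not the GIMP workflow itself: the image is a synthetic stand-in, the Gaussian sigma and the structuring element are guesses, and the code assumes the digits are darker than the background, so everything below the threshold is treated as foreground.

```matlab
% Synthetic stand-in for the photo: a dark stroke with a small gap
% on a light, noisy background (all values are made up for the demo).
rng(0);
rgb = uint8(200 + 20 * randn(60, 120, 3));
rgb(20:40, 30:90, :) = 40;       % dark "stroke"
rgb(25:35, 58:60, :) = 200;      % gap inside the stroke

% 1. Pick one color channel (green).
g = rgb(:, :, 2);

% 2. Blur with a 5x5 Gaussian kernel (sigma = 1 is an assumption).
gb = imfilter(g, fspecial('gaussian', [5 5], 1), 'replicate');

% 3. Threshold at an empirically chosen level (93 on a 0-255 scale);
%    dark pixels become foreground (true).
bwThresh = gb < 93;

% 4. Morphological closing (dilation followed by erosion) to bridge gaps.
bwClosed = imclose(bwThresh, strel('disk', 2, 0));

figure
subplot(211), imshow(bwThresh), title('after threshold')
subplot(212), imshow(bwClosed), title('after closing')
```

On a real image, replace the synthetic rgb with the result of imread and adjust the threshold and structuring element by trial and error, as described above.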

Green channel: [image]

Blur (5x5 Gaussian): [image]

Thresholded image (I used a threshold of ~93 in GIMP): [image]

Final result: [image]

You can see that the gaps in the middle 6 and 9 have disappeared. Unfortunately, I couldn't get the gap in the left 3 to go away -- it's simply too large. Here are the problems causing this:

  • The line along the top of the image is much darker than some parts of the 3. If you use a threshold to remove the line, then a gap will be created. If you were to somehow remove that line (e.g. by more zealous cropping), the thresholding result would be much better as far as the 3 is concerned.
  • Also, the middle 2 and 6 are running together. Heavy thresholding is required to prevent them from both forming the same blob after thresholding.
mpenkov
  • @misha thanks for your reply. The original image is in RGB; I converted it to black and white. How does it help me if my image is in RGB instead of black and white? Thanks again. – Ofir A. Mar 16 '11 at 10:33
  • If your original image is RGB, then you can do things like color channel manipulation and adaptive median filtering to reduce the effect of the noise. It will be easier to explain if you post your RGB image, so please upload it and let me know. – mpenkov Mar 16 '11 at 12:24
  • @misha I really appreciate your help. I'm at work right now, so I will post the RGB image later. I have it on my home computer. Thanks a lot. – Ofir A. Mar 16 '11 at 14:38
  • @misha how do I pick a color channel to work on? And how do I use a 5x5 Gaussian to blur the image? As you can see, my MATLAB skills are not very strong, so it would be great if you could elaborate. Anyhow, thanks a lot for your help. – Ofir A. Mar 17 '11 at 08:51
  • You can pick the color channel by visual inspection. Green is usually a safe choice because it is the dominant color when conversion to grayscale is performed (human eyes are most sensitive to green). As for Gaussian filtering, try this link: http://www.dsprelated.com/groups/matlab/show/2521.php – mpenkov Mar 17 '11 at 09:38

I think there are two things you could aim to do to make them more detectable:

  1. Remove patches smaller than a certain number of pixels (this would remove the spots between the sets of digits)
  2. Numbers should be "closed" forms, so you need an algorithm to detect the pixels (at the top of each number) that should be changed to black in order to "close" the number "shapes".

You also have linear features that are a part of the noise signal which could be detected through edge / line detection.

Detecting contiguous "zones" and calculating characteristics such as compactness or length / height might also help in identifying which structures to keep...
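In MATLAB (Image Processing Toolbox), the ideas above might look like the sketch below. The binary image is synthetic, and the 30-pixel area limit and the aspect-ratio cutoff of 3 are arbitrary values chosen for the example, so expect to tune them on real data.

```matlab
% Synthetic binary image: one digit-like blob, one speck, one long thin line.
bw = false(40, 80);
bw(10:30, 10:20) = true;    % digit-like blob (231 pixels)
bw(5:6, 40:41)   = true;    % small speck (4 pixels)
bw(35, 30:75)    = true;    % long horizontal line (46 pixels)

% Remove connected components with fewer than 30 pixels.
noSpecks = bwareaopen(bw, 30);

% Measure each remaining component and keep only those whose bounding box
% is not much wider than it is tall (drops line-like structures).
cc    = bwconncomp(noSpecks);
stats = regionprops(cc, 'BoundingBox');
keep  = false(size(bw));
for k = 1:cc.NumObjects
    bb = stats(k).BoundingBox;          % [x y width height]
    if bb(3) / bb(4) < 3                % width/height cutoff (assumed)
        keep(cc.PixelIdxList{k}) = true;
    end
end

imshow(keep)
```

bwareaopen handles the "patches smaller than a certain number of pixels" step, while regionprops exposes other shape measurements (for example Area, Eccentricity, or Extent) that could serve as alternative keep/discard criteria.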

Benjamin
  • I'm not a MATLAB expert, so I'd really appreciate it if you could be more specific. Which function should I use to remove patches smaller than a certain number of pixels? Thanks. – Ofir A. Mar 15 '11 at 13:51
  • @michael beware that removing small elements will only take care of the dots between the numbers. The rest of the noise is the biggest problem and requires more thought. – Dr. belisarius Mar 15 '11 at 14:17