Cleaning up captcha image

Question

I'm trying to clean up the image above. I've tried several different methods using OpenCV, but I end up eroding the original image so much that parts of the letters go missing, as shown below:

I'm not really sure how to get rid of the last diagonal line and repair the S. My code so far is:

import cv2
import matplotlib.pyplot as plt

img = cv2.imread('/captcha_3blHDdS.png')

# Convert the image to grayscale
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# Blur (note: only the bilateral result is used below; `blur` is unused)
blur = cv2.GaussianBlur(gray, (5, 5), 0)
bilateral = cv2.bilateralFilter(gray, 5, 75, 75)

# Thresholding (with THRESH_OTSU the threshold argument, 25, is ignored)
ret, thresh = cv2.threshold(bilateral, 25, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)

# Kernel
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (3, 3))

# Erode, then close small gaps
erosion = cv2.erode(thresh, kernel, iterations=1)
closing = cv2.morphologyEx(erosion, cv2.MORPH_CLOSE, kernel, iterations=1)

# Distance transform, then keep pixels far enough from the background
dist_transform = cv2.distanceTransform(closing, cv2.DIST_L2, 5)
ret, sure_fg = cv2.threshold(dist_transform, 0.02 * dist_transform.max(), 255, cv2.THRESH_BINARY)

# Second, smaller kernel
kernel_1 = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (1, 2))

dilation_1 = cv2.dilate(sure_fg, kernel_1, iterations=2)
erosion_1 = cv2.erode(dilation_1, kernel_1, iterations=3)

plt.imshow(erosion_1, 'gray')
plt.show()

Any help would be greatly appreciated. Here are more examples of the type of images produced by the captcha:

Also, here's the link to a folder containing the images.

"hopefully this will become a first step towards me seeing how computer vision coupled with deep learning will bring about a change in online text captcha implementation" it already is, in the opposite way: reCAPTCHA now has humans annotate image data for ML truthing, and Google hardly uses plain text captchas anymore. — alkasm, Jun 27 '17 at 19:13
That's 100% true. I think it's a good way to ease myself into deep learning and OpenCV, plus quite a few places, especially WordPress, still offer text-based captchas. – user3191569, Jun 27 '17 at 19:22
I think it would be really valuable to have a decent sized set of sample input images, since it's easy to hardcode something to work fairly well with the one you show, but fail miserably on anything else. See if you can create a mosaic of say 40 (4x10) or 80 (4x20) captchas. — Dan Mašek, Jun 27 '17 at 23:07
@DanMašek 4 of the same type i.e. same letters but different ordering? or do you mean 40 different captcha images. — user3191569, Jun 28 '17 at 15:09
@user3191569 Different ones. I just meant to arrange them in a grid (sort of like I did [here](https://stackoverflow.com/questions/36254452/counting-cars-opencv-python-issue/36274515#36274515)), since that's a bit more practical to embed in your question (rather than a long thin noodle, or a pile of tiny separate pictures). – Dan Mašek, Jun 28 '17 at 16:44
I have found this answer to be very helpful: https://dsp.stackexchange.com/questions/52089/removing-noisy-lines-from-image-opencv-python — abcd1234, Jul 05 '19 at 20:40
Can you provide the source of the image or the generator that was used for these images? — Phlogi, Jan 02 '22 at 11:20

Simon Mourier · Accepted Answer · 2017-08-08T08:47:42.213

Here is a C# solution using OpenCvSharp (which should be easy to convert back to Python/C++ because the method names are exactly the same).

It uses OpenCV's inpainting technique to avoid destroying too much of the letters before possibly running an OCR phase. The lines have a different color than the rest of the image, so we use that information very early, before any grayscale conversion or binarization. The steps are as follows:

1. Build a mask from the lines using their color (#707070).
2. Dilate that mask a bit, because the lines may have been drawn with antialiasing.
3. Repaint ("inpaint") the original image using this mask, which removes the lines while preserving most of what was below them (the letters). Removing the small dots before this step would probably give an even better result.
4. Apply some dilate/blur/threshold operations to finalize.

Here is the mask:

Here is the result:

Here is the result on sample set:

Here is the C# code:

static void Decaptcha(string filePath)
{
    // load the file
    using (var src = new Mat(filePath))
    {
        using (var binaryMask = new Mat())
        {
            // lines color is different than text
            var linesColor = Scalar.FromRgb(0x70, 0x70, 0x70);

            // build a mask of lines
            Cv2.InRange(src, linesColor, linesColor, binaryMask);
            using (var masked = new Mat())
            {
                // build the corresponding image
                // dilate lines a bit because aliasing may have filtered borders too much during masking
                src.CopyTo(masked, binaryMask);
                int linesDilate = 3;
                using (var element = Cv2.GetStructuringElement(MorphShapes.Ellipse, new Size(linesDilate, linesDilate)))
                {
                    Cv2.Dilate(masked, masked, element);
                }

                // convert mask to grayscale
                Cv2.CvtColor(masked, masked, ColorConversionCodes.BGR2GRAY);
                using (var dst = src.EmptyClone())
                {
                    // repaint big lines
                    Cv2.Inpaint(src, masked, dst, 3, InpaintMethod.NS);

                    // destroy small lines
                    linesDilate = 2;
                    using (var element = Cv2.GetStructuringElement(MorphShapes.Ellipse, new Size(linesDilate, linesDilate)))
                    {
                        Cv2.Dilate(dst, dst, element);
                    }

                    Cv2.GaussianBlur(dst, dst, new Size(5, 5), 0);
                    using (var dst2 = dst.BilateralFilter(5, 75, 75))
                    {
                        // basically make it B&W
                        Cv2.CvtColor(dst2, dst2, ColorConversionCodes.BGR2GRAY);
                        Cv2.Threshold(dst2, dst2, 255, 255, ThresholdTypes.Otsu);

                        // save the file
                        dst2.SaveImage(Path.Combine(
                            Path.GetDirectoryName(filePath),
                            Path.GetFileNameWithoutExtension(filePath) + "_dst" + Path.GetExtension(filePath)));
                    }
                }
            }
        }
    }
}

Elegant solution indeed! If anyone needs the same implementation in python, I have created a gist - https://gist.github.com/nirajpandkar/887ac6e95db3920382718095dc82e582 — Niraj Pandkar, Sep 05 '18 at 05:14

score 7 · Answer 2 · answered Jun 27 '17 at 19:33

Take a closer look at your captcha. Most of the dust in that image has a different grayscale value than the text.

The text has a gray value of 140 and the dust has a gray value of 112.

A simple grayscale filtering will help a lot here.

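import cv2
import numpy as np

infile = "A1nO4.png"
outfile = "A1nO4_out.png"

# scipy.misc.imread and imsave were removed in SciPy 1.2, so OpenCV is used here instead
im = cv2.imread(infile, cv2.IMREAD_GRAYSCALE)

# start from an all-white image and paint only the text pixels (gray value 140) black
out_im = np.full(im.shape, 255, dtype=np.uint8)
out_im[im == 140] = 0

cv2.imwrite(outfile, out_im)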

Now use cv2.dilate (or cv2.erode if the text is white on a black background) to get rid of the remaining dust.

Ghilas BELHADJ

Wouldn't that create separate regions that will cause issues when I attempt `findContours` and `boundingRect`? – user3191569 Jun 27 '17 at 20:38
Yes, you cannot use `findContours` and `boundingRect` to get the letters, but you can use them at least to remove the small blobs. – Ghilas BELHADJ Jun 27 '17 at 20:46

flamelite · Answer 3 · 2018-01-07T05:40:26.620

This is not a very robust solution, but it might be helpful in most cases:

Looking at the image samples posted above, I can see one common feature of the diagonal lines: they either start or end at the image edges, while the text we are interested in is in the middle. We can therefore determine the pixel values of those diagonal lines by searching the first and last few rows and columns of the image matrix, and then eliminate pixels with those values as noise. This approach may also be cheaper to compute.
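
One way this idea could be sketched with NumPy is shown below. The function name `remove_border_colors`, the `margin` width and the synthetic image are assumptions for illustration, and the sketch only works if the lines have a gray value that the text does not share.

```python
import numpy as np

def remove_border_colors(gray, background=255, margin=2):
    """Blank every pixel whose gray value also occurs in the outer margin."""
    border = np.concatenate([
        gray[:margin, :].ravel(), gray[-margin:, :].ravel(),
        gray[:, :margin].ravel(), gray[:, -margin:].ravel(),
    ])
    # gray levels seen near the edges, excluding the background itself
    noise_values = np.setdiff1d(np.unique(border), [background])
    cleaned = gray.copy()
    cleaned[np.isin(gray, noise_values)] = background
    return cleaned, noise_values

# synthetic image: a diagonal line (112) starting at the top-left corner,
# with a text-like block (140) drawn on top of it in the middle
img = np.full((30, 60), 255, dtype=np.uint8)
for i in range(30):
    img[i, 2 * i] = 112
img[10:20, 25:35] = 140

cleaned, noise_values = remove_border_colors(img)
```

Where a line is drawn over a letter, those pixels are blanked too and leave a gap in the letter, so a repair step such as inpainting or morphological closing would still be needed.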
