
I am trying to process CAPTCHA images of different types to extract the actual text. I am using OpenCV for this and it works, but the problem is that I have to set a different lower threshold for each image when converting the grayscale image to a binary image.

Main goal: remove the horizontal line and make the characters clear enough to read.

Code used:

import cv2

# Load a color image as grayscale (flag 0 = cv2.IMREAD_GRAYSCALE)
img = cv2.imread('it_captcha3.jpg',0)
ret, thresh_img = cv2.threshold(img, 180, 255, cv2.THRESH_BINARY_INV)
cv2.imshow('grey image',thresh_img)
cv2.imwrite("result11.jpg", thresh_img)
cv2.waitKey(0)
cv2.destroyAllWindows()

Captcha 1: [image]

Processed image 1 (threshold low: 180, high: 255): [image]

Captcha 2: [image]

Processed image 2 (threshold low: 200, high: 255): [image]

Captcha 3: [image]

Processed image 3 (threshold low: 165, high: 255): [image]

Jeya Kumar
  • Look at opencv documentation about [adaptive thresholding](https://docs.opencv.org/2.4/modules/imgproc/doc/miscellaneous_transformations.html?highlight=threshold#void%20adaptiveThreshold(InputArray%20src,%20OutputArray%20dst,%20double%20maxValue,%20int%20adaptiveMethod,%20int%20thresholdType,%20int%20blockSize,%20double%20C)) – M. Doosti Lakhani Sep 18 '18 at 13:26
  • Thanks for the suggestion. I guess the problem is that I need to change the threshold based on the color of the background and the characters, and I'm not sure how to achieve this. I tried https://stackoverflow.com/questions/23260345/opencv-binary-adaptive-threshold-ocr but I still have the same problem: every image behaves differently. – Jeya Kumar Sep 18 '18 at 13:38
  • Normalizing the colors of the grayscale image should help standardize the intensity values you have to work with. Have you tried using [cv2.normalize](https://docs.opencv.org/2.4/modules/core/doc/operations_on_arrays.html?highlight=mean#normalize)? – Mason McGough Sep 18 '18 at 14:23
  • @MasonMcGough: Hi, I tried normalizing like this: `img = cv2.normalize(img, None, alpha=0, beta=1, norm_type=cv2.NORM_MINMAX, dtype=cv2.CV_32F)`, but after that I can't apply the threshold. For every threshold in my range I get a plain white image. – Jeya Kumar Sep 18 '18 at 15:06
  • I think you should build your own strategy. One idea is to take the histogram of the grayscale image and fit a distribution model to it. I do not know which method is best suited to this goal, though. Try searching for papers on it. – M. Doosti Lakhani Sep 18 '18 at 16:02
  • @JeyaKumar I believe you would want to use `beta=255` since that is the range your thresholds are in. – Mason McGough Sep 18 '18 at 18:29
  • @MasonMcGough: Thanks for the suggestion, but there is still not much improvement. I have one more idea: is it possible to convert every input image to look like "CAPTCHA : 2" (blue and white)? That is, once we get the input image, convert the background to blue and the text to white, then convert to grayscale. Would this improve or stabilize the lower threshold? – Jeya Kumar Sep 18 '18 at 18:52

2 Answers


Have you tried Histogram Equalization?

It can make the intensity distribution of different images more consistent, so a single threshold has a better chance of working for all of them.

import cv2

# Load a color image as grayscale (flag 0 = cv2.IMREAD_GRAYSCALE)
img = cv2.imread('it_captcha3.jpg',0)
img = cv2.equalizeHist(img)
ret, thresh_img = cv2.threshold(img, 215, 255, cv2.THRESH_BINARY_INV)

cv2.imshow('grey image',thresh_img)
cv2.imwrite("result11.jpg", thresh_img)
cv2.waitKey(0)
cv2.destroyAllWindows()

In my trial, a fixed lower threshold of 215 worked.

Algopark

Look at the OpenCV thresholding tutorial: https://docs.opencv.org/3.4.0/d7/d4d/tutorial_py_thresholding.html

The thresholding algorithms described there should suit your application, and I suggest that you study image binarization.

In addition, you can take a look at these other solutions:

OpenCV binary adaptive threshold OCR

OpenCV Adaptive Threshold OCR

developer0hye
  • Hi, thanks for the suggestion. I already tried those options; unfortunately they didn't help. Is there any other way to remove those wavy horizontal lines? – Jeya Kumar Sep 20 '18 at 17:58