
original image

I have an image in which horizontal lines underline the text. I have tried several techniques in turn, including HoughLinesP and HoughLines, and then this code:

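import cv2
import numpy as np

image = cv2.imread('D:\\detect_words.jpg')
# Invert so that ink is bright and the background is dark
gray = 255 - cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
for row in range(gray.shape[0]):
    # Fraction of pixels in this row that are brighter than 16
    avg = np.average(gray[row, :] > 16)
    if avg > 0.25:
        # Mark the row in red on the colour image and blank it in the inverted grayscale
        cv2.line(image, (0, row), (gray.shape[1]-1, row), (0, 0, 255))
        cv2.line(gray, (0, row), (gray.shape[1]-1, row), (0, 0, 0), 1)
cv2.imwrite('D:\\words\\final_removed.jpg', image)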

This is the result after processing.

After this phase, I apply erosion and dilation:

kernel = np.ones((3,3), np.uint8) 
img_erosion = cv2.erode(255-gray, kernel, iterations=1) 
img_dilation = cv2.dilate(img_erosion, kernel, iterations=1) 
cv2.imwrite('D:\\words\\final_removed4.jpg',255-img_dilation)

final image after dilation and erosion

My question: this does remove horizontal lines, but the words lose pixels and not all of the horizontal lines are removed (for example, the line above AGE is still present). Is there an approach that minimizes this loss and removes all of the horizontal lines?

nathancy
IamKarim1992
  • Lower the 0.25 in avg > 0.25, so that it filters out shorter lines. Or filter on actual line length. Once the lines are removed, you can try morphology open to fill in the gaps in the text. – fmw42 Sep 07 '19 at 17:44
  • Is the top image in the OP detect_words.jpg or just part of it? It seems the white line you draw isn't thick enough to completely erase the bottom line. Regrettably, line thickness is an integer, so increasing it to 2 might be too much. You might have to scale the whole image so that a thickness of 2 is just right. – bfris Sep 08 '19 at 20:01
  • Possible duplicate of [Remove noisy lines from an image](https://stackoverflow.com/questions/54028493/remove-noisy-lines-from-an-image) – Cris Luengo Sep 09 '19 at 21:15

1 Answer


Here's an approach:

  • Convert image to grayscale
  • Otsu's threshold to get binary image
  • Create horizontal kernel and morph open to detect lines
  • Find contours and draw in detected lines

After converting to grayscale, we apply Otsu's threshold to obtain a binary image


image = cv2.imread('1.jpg')
gray = cv2.cvtColor(image,cv2.COLOR_BGR2GRAY)
thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]

Now we create a wide, one-pixel-tall horizontal kernel and apply a morphological opening with it to obtain a mask of the detected lines


horizontal_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (45,1))
detected_lines = cv2.morphologyEx(thresh, cv2.MORPH_OPEN, horizontal_kernel, iterations=2)

Here are the detected lines drawn on the original image


From here we find contours on this mask and draw them in white, which effectively removes the horizontal lines and gives our result


cnts = cv2.findContours(detected_lines, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]

for c in cnts:
    cv2.drawContours(image, [c], -1, (255,255,255), 3)

Now that the horizontal lines are removed, you can try to repair the text using cv2.MORPH_CLOSE with a cv2.MORPH_CROSS kernel, experimenting with various kernel sizes. There is a tradeoff: a larger kernel closes more of the holes, but too much dilation destroys detail in the text. Another approach is to use image inpainting to fill in the holes. I'll leave this step to you

Full code

import cv2

image = cv2.imread('1.jpg')
gray = cv2.cvtColor(image,cv2.COLOR_BGR2GRAY)
thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]

horizontal_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (45,1))
detected_lines = cv2.morphologyEx(thresh, cv2.MORPH_OPEN, horizontal_kernel, iterations=2)

cnts = cv2.findContours(detected_lines, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]

for c in cnts:
    cv2.drawContours(image, [c], -1, (255,255,255), 3)

cv2.imshow('thresh', thresh)
cv2.imshow('detected_lines', detected_lines)
cv2.imshow('image', image)
cv2.waitKey()
nathancy
  • Thank you so much for your response. How do I connect the broken parts of the A? I am doing erode and dilate, but what matrix should I pass in order to smooth that out? – IamKarim1992 Sep 12 '19 at 10:52
  • You can use `cv2.morphologyEx()` and try experimenting with [these matrixes](https://opencv-python-tutroals.readthedocs.io/en/latest/py_tutorials/py_imgproc/py_morphological_ops/py_morphological_ops.html#structuring-element) and various kernel sizes – nathancy Sep 12 '19 at 19:42