I have the following table:

enter image description here

I want to write a script that draws lines at the natural breaks between the rows of text in the table. The result would look like this:

enter image description here

Is there an OpenCV implementation that makes drawing these lines possible? I looked at the answers to the questions here and here, but neither worked. What is the best approach to solving this problem?

mmz
  • Convert to grayscale and threshold so everything that is white stays white and all else becomes black. Then count the number of non-zero pixels (np.count_nonzero) in each row. Your lines would be drawn in the middle of each group of fully white rows. – fmw42 Apr 17 '21 at 16:45
  • Exactly what I needed, thank you! Was having a brain block for some reason – mmz Apr 17 '21 at 17:27
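
The approach described in the comment above can be sketched without an image file. This is only an illustration under assumptions: the synthetic 12x8 array stands in for a real table image, and `white_run_centers` is a hypothetical helper name, not an OpenCV or NumPy function.

```python
import numpy as np

def white_run_centers(is_white):
    """Return the center index of each run of True values in a 1-D boolean array."""
    flags = np.asarray(is_white, dtype=bool)
    # Pad with False on both ends so every run has a detectable start and end.
    padded = np.concatenate(([False], flags, [False])).astype(np.int8)
    edges = np.diff(padded)
    starts = np.flatnonzero(edges == 1)   # first index of each run
    ends = np.flatnonzero(edges == -1)    # one past the last index of each run
    # Same convention as y + h // 2 for a bounding box of height h.
    return [int(s + (e - s) // 2) for s, e in zip(starts, ends)]

# Synthetic stand-in for a grayscale table: white background, two "text" bands.
img = np.full((12, 8), 255, dtype=np.uint8)
img[2:4, 1:5] = 0
img[7:10, 2:6] = 0

# Keep only pure white, then count white pixels per row.
thresh = np.where(img > 254, 255, 0).astype(np.uint8)
count = np.count_nonzero(thresh, axis=1)

# A row belongs to a gap only when every pixel in it is white.
centers = white_run_centers(count == img.shape[1])
print(centers)  # [1, 5, 11]
```

Each value in `centers` is a y coordinate where a horizontal line would go, including the margins above the first and below the last text band.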

1 Answer

Here is one way to get the horizontal lines in Python/OpenCV: threshold the image, count the white pixels in each row, treat each run of fully white rows as a gap, and draw a line at the center y value of each gap. The vertical lines can be added by a similar process applied to columns.

Input:

enter image description here

import cv2
import numpy as np

# read image
img = cv2.imread("table.png")
hh, ww = img.shape[:2]

# convert to grayscale
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# threshold so only pure white (255) stays white; everything else becomes black
thresh = cv2.threshold(gray, 254, 255, cv2.THRESH_BINARY)[1]

# count number of non-zero pixels in each row
count = np.count_nonzero(thresh, axis=1)

# mark rows that are entirely white (count equals image width ww) as 255, all others as 0
count_thresh = np.where(count == ww, 255, 0).astype(np.uint8)

# get contours
contours = cv2.findContours(count_thresh, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
contours = contours[0] if len(contours) == 2 else contours[1]

# loop over contours and get bounding boxes and ycenter and draw horizontal line at ycenter
result = img.copy()
for cntr in contours:
    x,y,w,h = cv2.boundingRect(cntr)
    ycenter = y+h//2
    cv2.line(result, (0,ycenter), (ww-1,ycenter), (0, 0, 0), 2)

# write results
cv2.imwrite("table_thresh.png", thresh)
cv2.imwrite("table_lines.png", result)

# display results
cv2.imshow("THRESHOLD", thresh)
cv2.imshow("RESULT", result)
cv2.waitKey(0)

Threshold Image:

enter image description here

Result with lines:

enter image description here
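
The listing above draws horizontal lines only. A rough sketch of the same counting applied per column follows, using `x + w // 2` for the center as suggested in the comments further down. It uses NumPy run detection in place of `cv2.findContours` so it runs without an image file; the synthetic array and the name `vertical_gap_centers` are assumptions made for illustration.

```python
import numpy as np

def vertical_gap_centers(thresh):
    """Return the x center of each run of fully white columns in a 0/255 image."""
    hh = thresh.shape[0]
    # A column is a gap only if all hh of its pixels are non-zero (white).
    is_gap = np.count_nonzero(thresh, axis=0) == hh
    padded = np.concatenate(([False], is_gap, [False])).astype(np.int8)
    edges = np.diff(padded)
    starts = np.flatnonzero(edges == 1)   # first column of each gap
    ends = np.flatnonzero(edges == -1)    # one past the last column of each gap
    # Equivalent to x + w // 2 for a bounding box starting at x with width w.
    return [int(x + (x_end - x) // 2) for x, x_end in zip(starts, ends)]

# Synthetic stand-in: 6 rows, 14 columns, two "text" columns of black pixels.
thresh = np.full((6, 14), 255, dtype=np.uint8)
thresh[1:5, 2:5] = 0
thresh[1:5, 8:11] = 0

xcenters = vertical_gap_centers(thresh)
print(xcenters)  # [1, 6, 12]

# With OpenCV, each center could then be drawn as:
# cv2.line(result, (xcenter, 0), (xcenter, hh - 1), (0, 0, 0), 2)
```

Any element that spans the full width, such as a solid header band, puts non-white pixels in every column, so no column would count as a gap. Cropping that band out before counting is one way around it.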

ADDITION

Here is an alternate method that is slightly simpler. It averages each row of the grayscale image down to a single pixel, producing one column, and thresholds that column rather than counting white pixels.

import cv2
import numpy as np

# read image
img = cv2.imread("table.png")
hh, ww = img.shape[:2]

# convert to grayscale 
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# average each row to a single pixel; dsize is (width, height), so (1, hh) is one column
column = cv2.resize(gray, (1, hh), interpolation=cv2.INTER_AREA)

# threshold on white
thresh = cv2.threshold(column, 254, 255, cv2.THRESH_BINARY)[1]

# get contours
contours = cv2.findContours(thresh, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
contours = contours[0] if len(contours) == 2 else contours[1]

# loop over contours and get bounding boxes and ycenter and draw horizontal line at ycenter
result = img.copy()
for cntr in contours:
    x,y,w,h = cv2.boundingRect(cntr)
    ycenter = y+h//2
    cv2.line(result, (0,ycenter), (ww-1,ycenter), (0, 0, 0), 2)

# write results
cv2.imwrite("table_lines2.png", result)

# display results
cv2.imshow("RESULT", result)
cv2.waitKey(0)

Result:

enter image description here
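
The two methods may not agree on rows that are almost, but not exactly, pure white. The sketch below compares them on a synthetic array using a NumPy row mean as a stand-in for the averaging step. It assumes the averaged value is rounded to the nearest integer; whether `cv2.resize` with `INTER_AREA` rounds in exactly this way is not verified here.

```python
import numpy as np

# Synthetic grayscale stand-in: 4 rows of 100 pixels, all pure white to start.
gray = np.full((4, 100), 255, dtype=np.uint8)
gray[1, 0] = 250      # one slightly off-white pixel
gray[2, :10] = 0      # ten black pixels, as text would produce

# Counting method: a row qualifies only if every pixel is exactly 255.
by_count = np.count_nonzero(gray > 254, axis=1) == gray.shape[1]

# Averaging method (assumption: the row mean is rounded to the nearest integer).
row_mean = np.rint(gray.mean(axis=1))
by_mean = row_mean > 254

print(by_count.tolist())  # [True, False, False, True]
print(by_mean.tolist())   # [True, True, False, True]
```

Under that assumption, averaging is a little more forgiving of faint noise, while rows containing text are rejected by both methods.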

fmw42
  • very helpful - more elegant than what I was able to come up with – mmz Apr 17 '21 at 22:27
  • Update: I ran the code and it works beautifully. Thanks again for the thoughtful answer! – mmz Apr 17 '21 at 23:07
  • See my addition in my answer for a slightly simpler method. – fmw42 Apr 17 '21 at 23:23
  • the simpler method works well. If it wouldn't be too much of a hassle, I was wondering how you'd implement vertical lines using the simpler method. I tried changing `column = cv2.resize(gray, (1, hh), interpolation=cv2.INTER_AREA) thresh = cv2.threshold(column, 254, 255, cv2.THRESH_BINARY)[1]` to `row = cv2.resize(gray, (1, ww), interpolation=cv2.INTER_AREA) thresh = cv2.threshold(row, 254, 255, cv2.THRESH_BINARY)[1]` but, when calling `cv2.drawContours`, see the contours still being drawn row-wise. Am I missing something? This is my first day using OpenCV – mmz Apr 17 '21 at 23:42
  • almost got it working using the first version. am just having trouble with the centering value. I currently have `xcenter = y+h//2`, which results in lines that are slightly off. (I used another image since the solid header interfered with column identification.) – mmz Apr 18 '21 at 00:57
  • center = x+w//2 if you are doing the vertical lines – fmw42 Apr 18 '21 at 15:52
  • This is very helpful; I used the code for my own work. It drew the horizontal lines successfully but failed to generate the vertical ones. Can you have a look at it? https://stackoverflow.com/questions/68523822/draw-line-on-a-gridless-image-python-opencv – Jia Gao Jul 26 '21 at 02:19
  • Can you add an explanation to your code? I've tried it for my own table image but it didn't draw anything :( – Dolev Mitz Feb 14 '23 at 05:29
  • I created a question about it @fmw42 https://stackoverflow.com/questions/75444109/draw-grid-on-a-gridless-fully-or-partially-table-image – Dolev Mitz Feb 14 '23 at 09:21
  • Your image does not have any rows with pure white due to the vertical lines. So you need to adjust things like the threshold (lower it) or use a different method. – fmw42 Feb 14 '23 at 17:29
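
One possible reading of "lower the threshold" in the last comment is to accept rows that are mostly white rather than entirely white. The sketch below illustrates that idea on a synthetic array. The helper `mostly_white_rows` and its `min_white_fraction` parameter are hypothetical names, and 0.95 is an arbitrary starting value that would need tuning per image.

```python
import numpy as np

def mostly_white_rows(thresh, min_white_fraction=0.95):
    """Flag rows whose share of white pixels is at least min_white_fraction."""
    ww = thresh.shape[1]
    white_fraction = np.count_nonzero(thresh, axis=1) / ww
    return white_fraction >= min_white_fraction

# Synthetic stand-in: 6 rows, 40 columns, with a 1-pixel vertical rule at x=20.
thresh = np.full((6, 40), 255, dtype=np.uint8)
thresh[:, 20] = 0          # the vertical rule crosses every row
thresh[2:4, 5:15] = 0      # rows 2 and 3 also contain "text"

# Strict test: every pixel must be white, so the rule disqualifies all rows.
strict = np.count_nonzero(thresh, axis=1) == thresh.shape[1]
# Relaxed test: a thin rule no longer hides the gaps between text rows.
relaxed = mostly_white_rows(thresh, min_white_fraction=0.95)

print(strict.tolist())   # [False, False, False, False, False, False]
print(relaxed.tolist())  # [True, True, False, False, True, True]
```

The relaxed mask can then be converted to a 0/255 `uint8` array and passed through the same contour and center steps as in the answer.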