I have an image like the one below. [![enter image description here][2]][2] I am trying to add both vertical and horizontal lines between its rows and columns. I succeeded in adding the horizontal lines using the code provided [here][1], but I failed to add the vertical ones. Can anyone point out what is wrong with my code?

import cv2
import numpy as np
import matplotlib.pyplot as plt
file = r"C:/Users/gaojia/Dropbox/Projects/Parking_lot/Community_Group_Buying/scripts/test_2.png"
# read image
img = cv2.imread(file)
hh, ww = img.shape[:2]

# convert to grayscale 
gray = cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)

# threshold gray image
thresh = cv2.threshold(gray, 254, 255, cv2.THRESH_BINARY)[1]

# count number of non-zero pixels in each column (axis=0 collapses the rows)
count = np.count_nonzero(thresh, axis=0)

# threshold count at hh (height of image)
count_thresh = count.copy()
count_thresh[count==hh] = 255
count_thresh[count<hh] = 0
count_thresh = count_thresh.astype(np.uint8)

# get contours
contours = cv2.findContours(count_thresh, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
contours = contours[0] if len(contours) == 2 else contours[1]

# loop over contours and get bounding boxes and xcenter and draw vertical line at xcenter
result = img.copy()
for cntr in contours:
    x,y,w,h = cv2.boundingRect(cntr)
    xcenter = x+w//2
    cv2.line(result, (xcenter,0), (xcenter, hh-1), (0, 0, 0), 2)


# display results
cv2.imshow("THRESHOLD", thresh)
cv2.imshow("RESULT", result)
cv2.waitKey(0)

The code above draws only one vertical line at the far left of the image. I removed the vertical line in the original image, but the result remains the same.

-----------------------EDIT-----------------------------

As pointed out in the first answer, I realized that the input to findContours() should not be a one-dimensional array. I therefore replaced the following code:

# count number of non-zero pixels in each column (axis=0 collapses the rows)
count = np.count_nonzero(thresh, axis=0)

# threshold count at hh (height of image)
count_thresh = count.copy()
count_thresh[count==hh] = 255
count_thresh[count<hh] = 0
count_thresh = count_thresh.astype(np.uint8)

with:

# find the indices of columns that contain any 0 (text) pixel
col_zero = np.nonzero(np.any(thresh == 0, axis=0))[0]
# set every such column to 0 so that only the all-white gaps between columns remain
thresh[:, col_zero] = 0

This adds the vertical lines between the columns of text.

[1]: Python & OpenCV: How to add lines to gridless table
[2]: https://i.stack.imgur.com/YnPns.png

Jia Gao

1 Answer

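Your issue is that `count` is one-dimensional regardless of whether you count along axis 0 or axis 1. When you pass it to `findContours`, the index along that single dimension comes back as the y coordinate of each bounding box, so x is always 0. You therefore have to swap x,y and w,h when unpacking the bounding boxes, as in the code below.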
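To see why the swap is needed, here is a small NumPy-only sketch. It assumes, as the working code below implies, that a 1-D array of length N is interpreted as an image with N rows and a single column. The toy `thresh` array and the name `as_image` are made up for illustration and are not part of the original code.

```python
import numpy as np

# Toy binary image: 4 rows, 7 columns; 255 = white background, 0 = text.
thresh = np.full((4, 7), 255, dtype=np.uint8)
thresh[1:3, 1:3] = 0   # first block of "text" covers image columns 1-2
thresh[0:4, 5:6] = 0   # second block of "text" covers image column 5

hh, ww = thresh.shape

# Per-column count of white pixels: 1-D, one entry per image column.
count = np.count_nonzero(thresh, axis=0)
count_thresh = np.where(count == hh, 255, 0).astype(np.uint8)

# Assumption: the 1-D array is seen as an image with ww rows and 1 column.
as_image = count_thresh.reshape(-1, 1)
ys, xs = np.nonzero(as_image)

print(count_thresh.shape, as_image.shape)
print("x coordinates:", xs.tolist())
print("y coordinates:", ys.tolist())
```

Under that assumption every x coordinate is 0 and the image-column indices show up in y, which matches the single line at the far left reported in the question.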

Input:

enter image description here

import cv2
import numpy as np

# read image
img = cv2.imread("gridless_table.png")
hh, ww = img.shape[:2]

# convert to grayscale 
gray = cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)

# threshold gray image
thresh = cv2.threshold(gray, 254, 255, cv2.THRESH_BINARY)[1]

# count number of non-zero pixels in each column
count = np.count_nonzero(thresh, axis=0)

# threshold count at hh (height of image)
count_thresh = count.copy()
count_thresh[count==hh] = 255
count_thresh[count<hh] = 0
count_thresh = count_thresh.astype(np.uint8)

# get contours
contours = cv2.findContours(count_thresh, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
contours = contours[0] if len(contours) == 2 else contours[1]

# loop over contours and get bounding boxes and xcenter and draw vertical line at xcenter
result = img.copy()
for cntr in contours:
    # must transpose x,y and w,h since count is one-dimensional but represents each column
    y,x,h,w = cv2.boundingRect(cntr)
    print(x,y,w,h)
    xcenter = x+w//2
    cv2.line(result, (xcenter,0), (xcenter, hh-1), (0, 0, 0), 2)

# save results
cv2.imwrite("gridless_table_lines.png", result)

# display results
cv2.imshow("THRESHOLD", thresh)
cv2.imshow("RESULT", result)
cv2.waitKey(0)

Result:

enter image description here

fmw42
  • Thanks @fmw42, I also found that the input to the `findContours` function was wrong, and I changed it, everything else of your code works great. Just a quick follow-up, how do you suggest I, a beginner of openCV, extract information from a table like this? – Jia Gao Jul 26 '21 at 08:42
  • @JasonGoal use pytesseract (Optical Character Recognition aka OCR) to extract the text/data from an image. See the following links for good introductory tutorials on how to do this: https://medium.com/analytics-vidhya/table-detection-and-text-extraction-5a2934f61caa https://towardsdatascience.com/a-table-detection-cell-recognition-and-text-extraction-algorithm-to-convert-tables-to-excel-files-902edcf289ec – as_owl Dec 04 '21 at 19:03
  • Can anyone tell me how I can do this horizontally? I mean, how can I draw a horizontal line for each row in this image? – Koushik Jan 28 '23 at 07:03
  • Read the documentation for np.count_nonzero. Just change the axis to horizontal. Replace `np.count_nonzero(thresh, axis=0)` with `np.count_nonzero(thresh, axis=1)` – fmw42 Jan 28 '23 at 17:05
  • This only draws 2 vertical lines for me, at the rightmost and leftmost sides of the image, which are the edges of the table – Dolev Mitz Feb 14 '23 at 05:33
  • Adjust the threshold. – fmw42 Feb 14 '23 at 17:23
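
The last two follow-ups (switching the axis to get horizontal lines, and getting only the two outer lines) can be sketched together in plain NumPy. This is not the answer's `findContours` approach. It finds runs of background directly in the 1-D profile, which also avoids the x/y swap. The function name `gap_centers` and the `min_fraction` parameter are hypothetical. `min_fraction` is one possible reading of "adjust the threshold"; lowering the gray-level threshold of 254 is another.

```python
import numpy as np

def gap_centers(binary, axis=0, min_fraction=1.0):
    """Return the center index of every background run along one direction.

    binary: 2-D uint8 array, nonzero = white background.
    axis=0 scans columns (centers are x positions for vertical lines);
    axis=1 scans rows (centers are y positions for horizontal lines).
    min_fraction: share of a row/column that must be white for it to count
    as a gap (hypothetical tolerance; 1.0 mirrors `count == hh`).
    """
    length = binary.shape[axis]
    white = np.count_nonzero(binary, axis=axis)
    is_gap = white >= min_fraction * length

    # Pad with False so every run has a detectable start and end.
    padded = np.concatenate(([False], is_gap, [False])).astype(np.int8)
    edges = np.diff(padded)
    starts = np.flatnonzero(edges == 1)
    ends = np.flatnonzero(edges == -1)   # exclusive end of each run
    return [int(s + (e - s) // 2) for s, e in zip(starts, ends)]

# Toy table: 6 rows x 9 columns, two text columns and two text rows.
clean = np.full((6, 9), 255, dtype=np.uint8)
for rows in (slice(1, 3), slice(4, 5)):
    clean[rows, 1:4] = 0
    clean[rows, 6:8] = 0

# Same table with two stray dark pixels inside the middle gap.
noisy = clean.copy()
noisy[0, 4] = 0
noisy[5, 5] = 0

print("vertical line x positions:", gap_centers(clean, axis=0))
print("horizontal line y positions:", gap_centers(clean, axis=1))
print("noisy, strict:", gap_centers(noisy, axis=0))
print("noisy, tolerant:", gap_centers(noisy, axis=0, min_fraction=0.8))
```

With the strict setting the noisy table keeps only its outer gaps, which resembles the symptom described above; relaxing the tolerance recovers the middle gap. Each returned position can then be passed to `cv2.line` in the same way as `xcenter` in the answer.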