Extracting table structures from image

Question

I have a bunch of images like

What would be a good way to extract just the table structure from the image? I'm only interested in extracting the straight lines.

I have been toying around with OpenCV Finding Contours code sample and the results are quite promising. I'm just wondering if there is maybe a better way?

Maybe you could also try [HoughLineTransform](http://docs.opencv.org/3.0-beta/doc/py_tutorials/py_imgproc/py_houghlines/py_houghlines.html), get all the **horizontal lines**, and build the ROI from the minimum y and maximum x coordinates (basically two diagonal corners of the ROI, a rectangle here) — Rick M., Jul 13 '17 at 18:18
I have tried http://docs.opencv.org/2.4/doc/tutorials/imgproc/imgtrans/hough_lines/hough_lines.html but the result is pretty bad. — chhenning, Jul 13 '17 at 18:38
Ok that is strange, so if I understand correctly, you want to extract just the table in between right? — Rick M., Jul 14 '17 at 08:04
I'd just like to extract the grid of horizontal and vertical lines. — chhenning, Jul 14 '17 at 14:20
In that case you could also try [CCA](http://docs.opencv.org/3.1.0/d3/dc0/group__imgproc__shape.html#gae57b028a2b2ca327227c2399a9d53241) — Rick M., Jul 14 '17 at 14:32
This looks like it's worth a try! https://stackoverflow.com/questions/10196198/how-to-remove-convexity-defects-in-a-sudoku-square/10226971#10226971 — chhenning, Mar 13 '18 at 20:44

score 6 · Answer 1 · answered Mar 17 '18 at 12:50

OpenCV has a nice way to detect line segments: the line segment detector (LSD). Here is a code snippet in Python:

import math
import cv2

img = cv2.imread('page2.png')
if img is None:
    raise FileNotFoundError('could not read page2.png')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# detect line segments; 0 is cv2.LSD_REFINE_NONE (no refinement)
lsd = cv2.createLineSegmentDetector(0)
dlines = lsd.detect(gray)

# dlines[0] holds the segments as x0, y0, x1, y1 (None if nothing was found)
segments = dlines[0] if dlines[0] is not None else []
for dline in segments:
    x0 = int(round(dline[0][0]))
    y0 = int(round(dline[0][1]))
    x1 = int(round(dline[0][2]))
    y1 = int(round(dline[0][3]))
    # draw the segment in blue (BGR order)
    cv2.line(img, (x0, y0), (x1, y1), (255, 0, 0), 1, cv2.LINE_AA)

    # print line segment length
    print(math.hypot(x0 - x1, y0 - y1))

cv2.imwrite('page2_lines.png', img)
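
Since the question only asks for the grid of horizontal and vertical lines, one possible follow-up is to filter the detected segments by angle and length. This is only a minimal sketch, not part of the snippet above: the helper name `split_axis_aligned` and the thresholds `max_tilt_deg` and `min_length` are assumptions for illustration and would need tuning per image.

```python
import math

def split_axis_aligned(segments, max_tilt_deg=2.0, min_length=20.0):
    """Split (x0, y0, x1, y1) segments into near-horizontal and near-vertical ones.

    Hypothetical helper: the thresholds are illustrative guesses, not tuned values.
    """
    horizontal, vertical = [], []
    for seg in segments:
        # The detector output nests each segment as [[x0, y0, x1, y1]];
        # flat 4-tuples are accepted as well.
        x0, y0, x1, y1 = seg[0] if len(seg) == 1 else seg
        dx, dy = x1 - x0, y1 - y0
        if math.hypot(dx, dy) < min_length:
            continue  # too short, most likely a text stroke rather than a table line
        # Tilt away from the horizontal axis, folded into [0, 90] degrees.
        tilt = math.degrees(math.atan2(abs(dy), abs(dx)))
        if tilt <= max_tilt_deg:
            horizontal.append((x0, y0, x1, y1))
        elif tilt >= 90.0 - max_tilt_deg:
            vertical.append((x0, y0, x1, y1))
    return horizontal, vertical
```

With the snippet above this would be called as `horizontal, vertical = split_axis_aligned(dlines[0])`, provided `dlines[0]` is not `None`. Diagonal and short segments are dropped, which should remove most of the strokes coming from text.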

I wonder how to extract a segment created by your code. Any suggestions? — explorer, Nov 16 '18 at 08:22
This doesn't give me all the line segments in the table :( Is there any quick fix for the code? — Sundeep Pidugu, Nov 30 '18 at 05:51

score 1 · Answer 2 · answered Dec 29 '19 at 17:25

Kindly go through my GitHub repository, Code for table extraction.

The code detects the table and extracts its information while keeping the spatial coordinates intact.

The code detects the lines of a table, as shown in the image below. I hope it solves your problem.

The extracted output in terms of a table is shown below.
