
I have a table with clear vertical and horizontal grid lines. The grid lines are sometimes black and sometimes white, but I know which in advance.

I'm trying to find a way to locate each cell in a photo of the table individually. Each cell has different properties (text, color, number, link, etc.), and I want to let the user perform some analysis on each cell before submitting.

When I show the user a given cell, I will also show the first cell of that row and the header cell of that column.

I've been searching the internet for the past 2-3 hours and found nothing. My code has gotten me nowhere so far, so there is no point in pasting it.

Some links I've tried:

  1. Generating tables with images in cells using Python
  2. Split cells from an image of a table
  3. Recognize cells from a table with Open CV and display the recognition result in QGraphicsView of PyQt5
  4. How to get individual cell in table using Pandas?
  5. Detect columns from a table with opencv
  6. detect a table part from entire image in python

Most of them only extract textual data from an image, but I do not want to extract any data. As a start, I simply want to detect all of the cells in the table and display each one to the user. For example, in a loop, show each cell (as is, without modification) together with the corresponding header and row index, so each iteration shows the user three things: {1} the cell, {2} the header, {3} the row index.

Example image (the actual data is classified, so I found a Google image to show the principle I'm looking for):

[Image: example table found on Google Images]

I know I didn't paste any code. That's because none of my attempts worked even a little bit, and I really have no idea what to do.

If you think there is anything I can do to improve my question, please tell me.

Dolev Mitz

1 Answer


(I've edited my answer to account for the fact that you can't remove the background colors)

The following code gives you the cells:

import cv2 as cv
import numpy as np

img = cv.imread("your_img.png", cv.IMREAD_GRAYSCALE)
## detect edges in the image. This picks up both the cell borders and the text;
## the text will be removed later
edges = cv.Canny(img, 10, 20)
## make the edges a bit thicker
edges = cv.dilate(edges, np.ones((3, 3), np.uint8))

## invert the image so that the edges are black and everything else is white
_, edges = cv.threshold(edges, 127, 255, cv.THRESH_BINARY_INV)
## find the contours of the cells
## (the two-value return matches OpenCV 4.x; 3.x returns three values)
conts, _ = cv.findContours(edges, cv.RETR_EXTERNAL, cv.CHAIN_APPROX_SIMPLE)
## add a convex hull around each contour to remove the text
conts = [cv.convexHull(cont) for cont in conts]
## filter out noise
conts = [cont for cont in conts if cv.contourArea(cont) > 100]
## draw the contours on a blank image; pixel value is 1, so scale it
## (e.g. multiply by 255) before showing it with cv.imshow
edges = cv.drawContours(edges * 0, conts, -1, (1,), 1)

Here's the final image
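To go from the contours to the cell / header / row-index triples asked for in the question, one possible follow-up is to take a bounding box per contour and arrange the boxes into a grid. The sketch below is only an illustration under some assumptions: the table is a regular grid (same number of cells in every row), the first row is the header, the first column holds the row labels, and boxes in the same row have top edges within `tol` pixels of each other. The helper names `group_into_rows` and `iter_cells` and the `tol` parameter are made up for this example and are not part of OpenCV.

```python
def group_into_rows(boxes, tol=10):
    """Group (x, y, w, h) boxes into rows and sort each row left to right.

    A box joins the current row when its top edge is within `tol` pixels
    of the top edge of the first box in that row (an assumed heuristic).
    """
    rows = []
    for box in sorted(boxes, key=lambda b: b[1]):
        if rows and abs(box[1] - rows[-1][0][1]) <= tol:
            rows[-1].append(box)
        else:
            rows.append([box])
    return [sorted(row, key=lambda b: b[0]) for row in rows]


def iter_cells(rows):
    """Yield (cell, header, row_label) boxes for every body cell.

    Assumes rows[0] is the header row and the first box of each row
    is the row label.
    """
    header_row = rows[0]
    for row in rows[1:]:
        for col, cell in enumerate(row[1:], start=1):
            yield cell, header_row[col], row[0]


# Made-up boxes for a 2x2 grid, deliberately out of order
boxes = [(110, 52, 100, 40), (0, 0, 100, 40), (0, 50, 100, 40), (110, 1, 100, 40)]
rows = group_into_rows(boxes)
triples = list(iter_cells(rows))
```

With the code above, the boxes would come from `boxes = [cv.boundingRect(cont) for cont in conts]`, and each box can be cropped from the original image with `img[y:y+h, x:x+w]` for display. Depending on the image, you may first need to drop any contour that covers the whole table or the area around it.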

Nicolas Busca