Identify Table Cells Individually (Separately) using Python

Question

I have an image of a table with clear vertical and horizontal grid lines (the grid lines are sometimes black and sometimes white, and I can know which in advance).

I'm trying to find a way to locate each cell in the table photo individually. Each cell has different properties (text, color, number, link, etc.), and I want to let the user perform some analysis on each cell before submitting.

When I show the user a given cell, I will also show them the first cell of that row and the header cell of that column.

I've been searching the internet for the past 2-3 hours and found nothing. My code has gotten me nowhere so far, so there is no point in pasting it.

Some links I've tried:

Most of them only extract textual data from an image, but I do not want to extract any data. As a start, I simply want to detect all of the cells in the table and show each of them to the user. For example, in a loop, show each cell (as is, without modification) with the corresponding header and row index, so that each iteration shows the user 3 things: {1} the cell, {2} the header, {3} the row index.

Example image (the actual data is classified, so I found a Google image that shows the principle I'm looking for):

Image Link

I know I didn't paste any code. That's because none of my attempts worked even a little bit, and I really have no idea what to do.

If there is anything I can do to improve my question, please tell me.

Nicolas Busca · Answer 1 · 2023-04-17T09:54:21.537

2

(I've edited my answer to account for the fact that you can't remove the background colors)

The following code gives you the cells:

import cv2 as cv
import numpy as np

img = cv.imread("your_img.png", cv.IMREAD_GRAYSCALE)
## detect edges in the image. Will get the table cells and the text
## the text will be removed later
edges = cv.Canny(img, 10, 20)
## make the edges a bit thicker (uint8 kernel is the conventional dtype)
edges = cv.dilate(edges, np.ones((3, 3), np.uint8))

## invert image so that cell interiors become white blobs
_, edges = cv.threshold(edges, 127, 255, cv.THRESH_BINARY_INV)
## find the contours of the cells
conts, _ = cv.findContours(edges, cv.RETR_EXTERNAL, cv.CHAIN_APPROX_SIMPLE)
## add a convex hull around to remove the text
conts = [cv.convexHull(cont) for cont in conts]
## filter out noise
conts = [cont for cont in conts if cv.contourArea(cont) > 100]
## draw the cell outlines in white on a blank image
edges = cv.drawContours(np.zeros_like(edges), conts, -1, 255, 1)

Here's the final image:

edited Apr 17 '23 at 09:54

answered Apr 17 '23 at 08:49

Nicolas Busca


I need the cells to remain as they are as I said – Dolev Mitz Apr 17 '23 at 08:51
it's only to detect the contours, your original image can stay as it is – Nicolas Busca Apr 17 '23 at 08:52
How will I remove the cells background then? each cell can have any background color, there is no limit to the styling it got – Dolev Mitz Apr 17 '23 at 08:54
you can do it on the spreadsheet, select all the cells and remove the background – Nicolas Busca Apr 17 '23 at 08:54
I do not have a spreadsheet, I have only the image – Dolev Mitz Apr 17 '23 at 08:58
got it, see the edits – Nicolas Busca Apr 17 '23 at 09:59
I think I might need to rephrase my question. What I meant is that I need to give the users each cell individually, as is, meaning each cell alone with its content. The loop was only an example: go over all the cells and each time show the user one cell as is (without modifying its content) – Dolev Mitz Apr 17 '23 at 11:25
conts above is a list where each element is a contour around a cell. You can use that to extract each cell from the original image. – Nicolas Busca Apr 17 '23 at 12:25
Alright, I will try to use it, but how can I crop each cell individually and display it? – Dolev Mitz Apr 18 '23 at 03:50
check this: https://stackoverflow.com/questions/15424852/region-of-interest-opencv-python – Nicolas Busca Apr 18 '23 at 06:15
I tried the code on other examples, such as this one: [https://i.stack.imgur.com/xuxzg.jpg](https://i.stack.imgur.com/xuxzg.jpg) but it doesn't seem to work – Dolev Mitz Apr 25 '23 at 03:45
Yes, I would expect that you can adapt the answer above to your particular problem – Nicolas Busca Apr 25 '23 at 06:00
To be honest I tried without luck, but I'll keep on trying – Dolev Mitz Apr 25 '23 at 07:04
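
For the header and row lookup asked about in the question, a rough sketch is to sort the cell bounding boxes into a grid and then walk it. The boxes are assumed to be `(x, y, w, h)` tuples such as those returned by `cv.boundingRect` for each contour in `conts`. The function names and the `tol` parameter are hypothetical, and the approach assumes a regular grid with no merged cells.

```python
def group_into_rows(boxes, tol=10):
    """Group (x, y, w, h) boxes into rows, top to bottom.

    A box joins the current row when its top edge is within `tol`
    pixels of the top edge of that row's first box. Each row is
    returned sorted left to right.
    """
    rows = []
    for box in sorted(boxes, key=lambda b: b[1]):
        if rows and abs(box[1] - rows[-1][0][1]) <= tol:
            rows[-1].append(box)
        else:
            rows.append([box])
    return [sorted(row, key=lambda b: b[0]) for row in rows]


def cells_with_context(rows):
    """Yield (cell, column header, first cell of the row) for body cells.

    The first row is treated as the header row and the first column as
    the row index, so neither is yielded as a cell of its own.
    """
    header = rows[0]
    for row in rows[1:]:
        for col, cell in enumerate(row[1:], start=1):
            if col < len(header):  # ignore stray boxes past the header
                yield cell, header[col], row[0]
```

Each yielded box can then be used to slice the original image, for example `original[y:y + h, x:x + w]`, which gives the three things to show in each iteration. The right value for `tol` depends on the image resolution and the row height.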
