What is a robust way to parse an image of a table? I saw a question about counting the number of x's in a table, but that approach relies on having an image of an x to search with.

Here is an example image of a table I would like to parse:

[Image: example table]

In my case the data would be mostly numbers. How can I extract the cells from the table image so that each cell becomes a separate image that can be passed to OCR, with the order of the data preserved? Is there a machine learning solution for this, rather than a classical computer vision one?

By robust I mean:

  • Works with different cell backgrounds
  • Does not fail with thicker or thinner outlines, or no outlines at all
  • Works with different spacing between columns / rows
rawsh
  • If you are open to it, I think the simplest way would be to try a custom object detection API such as http://app.nanonets.com/ObjectCategorySelection/ – pratsJ Dec 05 '17 at 03:26
  • And match text? I guess I will try – rawsh Dec 05 '17 at 15:03
  • The solution at https://stackoverflow.com/questions/33452222/detect-table-with-opencv/46806306#46806306 might be helpful with some small tweaks. – flamelite Dec 11 '17 at 08:19

1 Answer

The OCR API seems to offer some table-related functionality. I only just found it, so I have no further insight, but you might want to check it out. Its online test lets you tick the following box:

Do receipt scanning and/or table recognition

My results were okay: single letters weren't detected, but overall the text and numbers were recognized.
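For anyone who wants to try the same option from code rather than the web form, here is a minimal sketch. It assumes the service is OCR.space, that its parse endpoint is `https://api.ocr.space/parse/image`, that the form field `isTable` corresponds to the checkbox above, and that the JSON response carries the text under `ParsedResults[].ParsedText`. I have not verified any of this against the live service, and the helper names `build_table_request` and `extract_lines` are made up for illustration.

```python
import json
import urllib.parse
import urllib.request

# Assumption: endpoint of the OCR service mentioned above (OCR.space).
OCR_ENDPOINT = "https://api.ocr.space/parse/image"


def build_table_request(image_url: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a POST request with table recognition enabled."""
    fields = {
        "apikey": api_key,   # assumption: key is accepted as a form field
        "url": image_url,    # publicly reachable image of the table
        "isTable": "true",   # assumption: same option as the checkbox
    }
    body = urllib.parse.urlencode(fields).encode("ascii")
    return urllib.request.Request(OCR_ENDPOINT, data=body, method="POST")


def extract_lines(response_json: str) -> list[str]:
    """Return the non-empty text lines found in a JSON response.

    Assumes the shape {"ParsedResults": [{"ParsedText": "..."}]}.
    """
    payload = json.loads(response_json)
    lines = []
    for result in payload.get("ParsedResults") or []:
        for line in result.get("ParsedText", "").splitlines():
            if line.strip():
                lines.append(line.strip())
    return lines
```

To actually send it, something like `urllib.request.urlopen(build_table_request(url, key)).read().decode()` could be passed to `extract_lines`. Splitting each returned line into cells is left out, since I don't know how reliably the columns are separated.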

Honeybear