
I need to extract text segments from a grid image for OCR. I have tried several approaches, such as HoughLines, connected components, and morphological operations, but none of them has given satisfactory results. Could anyone suggest a better approach? I have attached a few sample images.

Sample Image Sample Image

Cris Luengo
Srini
  • I could totally solve that with Mathematical Morphology, but you want something better... sigh. Saying you've tried something is not helpful. If you want help, show what you tried and why it was not satisfactory. We could maybe help improve your approach. We're not going to write all the code for you. – Cris Luengo Mar 21 '18 at 01:43
  • I have given connected components based grid/line detection and removal here https://stackoverflow.com/a/46806306/5545458 which might be helpful. – flamelite Mar 21 '18 at 05:10
  • One of my favorite methods to tinker with in a case like this, and something to be aware of: https://www.microsoft.com/en-us/research/publication/stroke-width-transform/ – Rethunk Mar 23 '18 at 02:16

1 Answer


Unless you are trying to build OCR from the ground up, I'd recommend using Tesseract. The approaches you listed would only cover the feature extraction step of the OCR process. This Python implementation is pretty simple and does the heavy lifting for you. Best of luck!
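For illustration, here is a minimal sketch of what handing each grid cell to Tesseract might look like. It assumes the `pytesseract` wrapper and a Pillow image, which may or may not be the implementation linked above, and it assumes you already have the cell bounding boxes from your own grid detection step. The names `ocr_cells` and `_tesseract_ocr` are hypothetical helpers, and `--psm 7` (treat the crop as a single text line) is only a guess at what suits your cells.

```python
from typing import Callable, Iterable, List, Optional, Tuple

# (left, upper, right, lower), the box convention used by Pillow's Image.crop
Box = Tuple[int, int, int, int]


def _tesseract_ocr(cell) -> str:
    # Assumption: the pytesseract package and the Tesseract binary are installed.
    # Imported lazily so the rest of the module works without them.
    import pytesseract

    # "--psm 7" asks Tesseract to treat the crop as a single line of text.
    return pytesseract.image_to_string(cell, config="--psm 7")


def ocr_cells(image, boxes: Iterable[Box],
              ocr: Optional[Callable[..., str]] = None) -> List[str]:
    """Crop each box out of the image and return the OCR text for every cell."""
    if ocr is None:
        ocr = _tesseract_ocr
    # Crop one cell at a time so grid lines outside the box never reach the OCR engine.
    return [ocr(image.crop(box)).strip() for box in boxes]
```

You would call it with something like `ocr_cells(Image.open("grid.png"), boxes)`, where `grid.png` and `boxes` stand in for your own image and detected cells. The `ocr` parameter is there only so you can swap in another engine or a stub while tuning the cell detection.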

Jello