I have this image (some information was deliberately removed from it).

What I need is a way to remove the borders (lines) around the text.

I am running OCR on these images, and the lines get in the way of text recognition.

Everything also has to work automatically: the OCR and all other scripts are executed on the server side when someone uploads a document.

Tomkis90

2 Answers

You could try using a Hough transform to detect all straight lines in the image and then mask them out.
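
To make the idea concrete, here is a minimal pure-Python sketch of a Hough transform followed by masking. It is an illustration under assumptions, not a production implementation: the page is assumed to be already binarised into a list of rows of 0/1 values (1 = ink), the names `hough_lines` and `mask_lines` are made up for this example, and the vote threshold has to be tuned per document. A real pipeline would more likely use the Hough implementation of an image library, since this version is far too slow for full-size scans.

```python
import math


def hough_lines(grid, threshold):
    """Return (rho, theta_deg) pairs that collect at least `threshold` votes.

    Every ink pixel votes for all lines rho = x*cos(theta) + y*sin(theta)
    passing through it, with theta sampled in 1-degree steps.
    """
    votes = {}
    for y, row in enumerate(grid):
        for x, ink in enumerate(row):
            if not ink:
                continue
            for theta_deg in range(180):
                t = math.radians(theta_deg)
                rho = round(x * math.cos(t) + y * math.sin(t))
                key = (rho, theta_deg)
                votes[key] = votes.get(key, 0) + 1
    return [key for key, count in votes.items() if count >= threshold]


def mask_lines(grid, lines):
    """Return a copy of `grid` with ink pixels on the detected lines cleared."""
    out = [row[:] for row in grid]
    for rho, theta_deg in lines:
        t = math.radians(theta_deg)
        for y, row in enumerate(grid):
            for x, ink in enumerate(row):
                if ink and round(x * math.cos(t) + y * math.sin(t)) == rho:
                    out[y][x] = 0
    return out
```

Because a short stroke can only contribute a few votes to any single line, setting `threshold` close to the length of the shortest border you want to remove tends to leave text alone, although characters that touch a border can still lose pixels.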

bgordon
  • How would I change the code so that I can pass the image through the command line (terminal)? – Tomkis90 Jun 04 '18 at 11:27
  • If you write the code as a Python script you can run it using the command line (i.e. `python process_image.py path/to/image.png`). [See here](https://stackoverflow.com/questions/1009860/how-to-read-process-command-line-arguments) for how to use system arguments with your script. – bgordon Jun 04 '18 at 11:32
  • I've tried it and it also detects text, which is a problem because I need to keep the text as is and only delete the lines. – Tomkis90 Jun 04 '18 at 11:44
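
One possible way to address both comments is sketched below: a script that takes the image path on the command line and clears only horizontal and vertical ink runs above a minimum length, so that short text strokes are kept. This swaps the Hough transform for a simpler run-length filter and rests on several assumptions: the borders are axis-aligned (the scan is not skewed), the image has already been binarised, and it is stored as a plain-text PBM (`P1`) file, a format chosen here only so the sketch needs nothing beyond the standard library. The function names and the `--min-run` option are hypothetical.

```python
import argparse


def long_runs(cells, min_len):
    """Yield (start, end) index pairs of runs of 1s at least min_len long."""
    start = None
    for i, ink in enumerate(list(cells) + [0]):  # trailing 0 closes a final run
        if ink and start is None:
            start = i
        elif not ink and start is not None:
            if i - start >= min_len:
                yield start, i
            start = None


def remove_long_runs(grid, min_len):
    """Clear long horizontal and vertical runs; runs are measured on the input."""
    out = [row[:] for row in grid]
    for y, row in enumerate(grid):
        for a, b in long_runs(row, min_len):
            for x in range(a, b):
                out[y][x] = 0
    for x in range(len(grid[0]) if grid else 0):
        column = [row[x] for row in grid]
        for a, b in long_runs(column, min_len):
            for y in range(a, b):
                out[y][x] = 0
    return out


def read_pbm(path):
    """Read a plain-text (P1) PBM file into rows of 0/1 values (1 = black)."""
    with open(path) as f:
        lines = [line.split("#", 1)[0] for line in f]  # drop comments
    tokens = " ".join(lines).split()
    if not tokens or tokens[0] != "P1":
        raise ValueError("expected a plain PBM (P1) file")
    width, height = int(tokens[1]), int(tokens[2])
    bits = [int(c) for c in "".join(tokens[3:])]
    return [bits[r * width:(r + 1) * width] for r in range(height)]


def write_pbm(path, grid):
    """Write rows of 0/1 values as a plain-text (P1) PBM file."""
    with open(path, "w") as f:
        f.write(f"P1\n{len(grid[0])} {len(grid)}\n")
        for row in grid:
            f.write(" ".join(map(str, row)) + "\n")


def main(argv=None):
    parser = argparse.ArgumentParser(description="Remove long straight lines")
    parser.add_argument("image", help="input plain PBM file")
    parser.add_argument("output", help="path for the cleaned PBM file")
    parser.add_argument("--min-run", type=int, default=40,
                        help="shortest run, in pixels, treated as a line")
    args = parser.parse_args(argv)
    cleaned = remove_long_runs(read_pbm(args.image), args.min_run)
    write_pbm(args.output, cleaned)
```

With the usual `if __name__ == "__main__": main()` guard added at the bottom, this could be run as `python process_image.py scan.pbm cleaned.pbm --min-run 40`. The value of `--min-run` should be larger than the tallest and widest character stroke but shorter than the shortest border.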

You can use Leptonica to remove lines.

http://www.leptonica.com/line-removal.html

https://github.com/DanBloomberg/leptonica/blob/master/prog/lineremoval_reg.c

nguyenq