I am splitting a large invoice image into pieces, line by line, so that I can extract the text with OCR. I am using Sobel edge detection, and it works fine on upright scans: I cut the image at the rows where the sum of edges is 0.

EI = edge(im, 'Sobel', [], 'vertical');  % vertical edges only, automatic threshold
histy = sum(EI, 2);                      % number of edge pixels in each row
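
For reference, here is a minimal sketch of how the cut rows could be turned into line strips. It is not necessarily the exact code I run: the synthetic `im` and the names `isText`, `startRows`, `endRows` and `textLines` are made up for illustration, and the Image Processing Toolbox is assumed.

```matlab
% Synthetic stand-in for a scan (assumption): two bands of vertical strokes
% on a black page. Replace with your own image.
im = zeros(60, 100);
im(10:20, 10:5:90) = 1;    % first "text line"
im(35:45, 10:5:90) = 1;    % second "text line"

EI = edge(im, 'Sobel', [], 'vertical');
histy = sum(EI, 2);                      % edge pixels per row

% Rows with at least one edge pixel belong to a text line.
isText = histy(:) > 0;

% diff on the padded mask gives +1 where a run of text rows starts
% and -1 one row past where it ends.
d = diff([false; isText; false]);
startRows = find(d == 1);
endRows   = find(d == -1) - 1;

% Cut one strip per run of consecutive text rows.
textLines = cell(numel(startRows), 1);
for k = 1:numel(startRows)
    textLines{k} = im(startRows(k):endRows(k), :);
end
```

Each cell of `textLines` can then be passed to the OCR step on its own.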

The problem arises when the source image is not perfectly upright. The images come from a scanner, so the orientation may be off, and then this technique fails. Below is a sample image that fails because every row contains edges: since the text rows are not horizontally aligned, no row has a zero edge sum.
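
One possible way to estimate the skew, sketched here only as an illustration, is to rotate the image over a range of candidate angles and keep the angle whose row sums vary the most, since well-aligned text rows give sharp peaks and empty gaps. The synthetic `page`, the ±10 degree search range, the 0.5 degree step and all variable names are assumptions, not values taken from my data.

```matlab
% Synthetic stand-in (assumption): three white text bands on black,
% skewed by 3 degrees. Replace bw with your own binarized scan.
page = false(200, 300);
page(40:50,   30:270) = true;
page(90:100,  30:270) = true;
page(140:150, 30:270) = true;
bw = imrotate(page, 3, 'nearest', 'crop');

angles = -10:0.5:10;                 % candidate corrections in degrees (assumed range)
score  = zeros(size(angles));
for k = 1:numel(angles)
    rot = imrotate(bw, angles(k), 'nearest', 'crop');
    rowSum = sum(rot, 2);            % horizontal projection profile
    score(k) = var(rowSum);          % aligned rows give a high-variance profile
end

[~, best] = max(score);
skewAngle = angles(best);            % rotation that best aligns the rows
deskewed  = imrotate(bw, skewAngle, 'nearest', 'crop');
```

Note that `'crop'` keeps the image size fixed, so content near the corners can be cut off for larger angles. The row-sum cut from the original approach can then be applied to `deskewed`.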

enter image description here

rayryeng
Ahmed Bilal
  • Why Sobel? It's a binarized image, you can directly look for rows that are mostly white. – Cris Luengo Feb 27 '18 at 21:59
  • Try Googling *"deskew text"*. – Mark Setchell Feb 27 '18 at 22:59
  • Building on @MarkSetchell's comment, the duplicate I've marked takes an object, fills in its holes and finds its orientation. It then rotates the object so that it is axis-aligned. You can then apply your original logic afterwards. Specifically, you can apply the duplicate to your white blob of text on a black background: the white blob is treated as an object and the same principle applies. I would recommend zero-padding the image before using the above logic, to ensure you get the full extent of the white blob of text and can compute the orientation angle accurately. – rayryeng Feb 28 '18 at 07:23
  • @MarkSetchell Deskewing gets close to what I expect, but not 100%. – Ahmed Bilal Feb 28 '18 at 07:39
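
The approach described in rayryeng's comment (zero-pad, fill holes, measure the blob's orientation, rotate) might look roughly like the sketch below. It is an interpretation of the comment, not the code from the linked duplicate. It assumes the text region is a single white blob that is wider than it is tall, and the synthetic `page`, the padding size `pad` and the use of `bwareafilt` to keep only the largest object are illustrative choices.

```matlab
% Synthetic stand-in (assumption): one white blob, wider than tall,
% skewed by 5 degrees. Replace bw with your own binarized scan.
page = false(200, 300);
page(60:140, 40:260) = true;
bw = imrotate(page, 5, 'nearest', 'crop');

% Zero-pad so the blob does not touch the border.
pad   = 50;                                   % hypothetical padding in pixels
bwPad = padarray(bw, [pad pad], 0, 'both');

% Fill holes and keep the largest connected object as "the" text blob.
blob = bwareafilt(imfill(bwPad, 'holes'), 1);

% Orientation is the angle in degrees between the x-axis and the blob's
% major axis, counterclockwise positive.
stats = regionprops(blob, 'Orientation');

% imrotate also rotates counterclockwise for positive angles, so the
% negated orientation brings the major axis back to horizontal.
deskewed = imrotate(bwPad, -stats.Orientation, 'nearest', 'crop');
```

If the blob is taller than it is wide, its major axis is vertical and the measured orientation would be close to ±90 degrees, so the angle would need adjusting before rotating.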
