10

I have a large collection of scanned images, and they are all somewhat skewed, with a white area around them.

So, these images have rectangles of colors, surrounded by a large white area. The problem is that these rectangles of color are not parallel to the image border.

I'm sure there must be a way to programmatically detect these rectangles of color, so that I can rotate the image (thus un-skewing it) and then crop it so that just the interesting part is left. I guess I'm not really sure what this process is called, so I am having trouble searching for a solution on Google.

Does anyone know of an approach that would get me started? Any libraries out there that I should look into? Or the name of an algorithm that would help?

I am planning on using Java for this project, but I haven't really started yet, so I am open to library suggestions in any language.

pkaeding

3 Answers

3
  • border detection
  • Hough transform (if all rectangles on an image have the same skew)
  • rectangle contour detection (find the contour of each connected component, then take its minimum-area bounding rectangle)
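To make the last option a bit more concrete, here is a minimal pure-Python sketch of the minimum-area bounding rectangle step. It assumes you have already loaded the scan as a 2-D list of grayscale values (0-255) and that a simple brightness threshold separates the colored rectangle from the white background. The names `non_white_pixels`, `convex_hull` and `min_area_rect` and the threshold of 240 are made up for illustration; they are not from any library.

```python
import math


def non_white_pixels(gray, threshold=240):
    """Collect (x, y) for every pixel darker than `threshold` (assumed cutoff)."""
    return [(x, y) for y, row in enumerate(gray)
            for x, value in enumerate(row) if value < threshold]


def convex_hull(points):
    """Andrew's monotone chain; returns the hull vertices in order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def min_area_rect(points):
    """Return (angle_degrees, long_side, short_side) of the smallest enclosing rectangle.

    The angle is that of one rectangle side, normalised to (-45, 45].
    With y pointing down (image coordinates), a positive angle looks
    clockwise on screen.
    """
    hull = convex_hull(points)
    if len(hull) < 3:
        raise ValueError("need at least three non-collinear points")
    best = None
    for i in range(len(hull)):
        (x0, y0), (x1, y1) = hull[i], hull[(i + 1) % len(hull)]
        theta = math.atan2(y1 - y0, x1 - x0)
        c, s = math.cos(theta), math.sin(theta)
        # Project the hull onto this edge's direction and onto its normal.
        us = [x * c + y * s for x, y in hull]
        vs = [-x * s + y * c for x, y in hull]
        w, h = max(us) - min(us), max(vs) - min(vs)
        # The minimum-area rectangle always shares a side with a hull edge.
        if best is None or w * h < best[0]:
            best = (w * h, theta, w, h)
    _, theta, w, h = best
    angle = math.degrees(theta) % 90.0  # a rectangle looks the same every 90 degrees
    if angle > 45.0:
        angle -= 90.0
    return angle, max(w, h), min(w, h)
```

Something like `angle, long_side, short_side = min_area_rect(non_white_pixels(gray))` would then give you the skew to undo before cropping. Scanner noise (stray dark specks in the white margin) will inflate the hull, so in practice you would probably want to clean those up first.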
rwong
  • Thanks! A search for Hough Transform led me to http://www.recognition-software.com/image/deskew/ which didn't solve my problem right out of the box, but I was able to tweak the code a bit to get it to work very well. – pkaeding Jul 04 '10 at 03:48
  • Was it subsumed by Tess4j? – wprl Dec 18 '13 at 23:00
3

Alyn is a third-party package that detects and fixes skew in images containing text. It uses Canny edge detection and the Hough transform to find the skew.

To detect the skew, just run

./skew_detect.py -i image.jpg

To correct the skew, run

./deskew.py -i image.jpg -o skew_corrected_image.jpg
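
For intuition about what the Hough step does, here is a rough pure-Python sketch of the voting idea. This is not Alyn's code. It assumes the edge pixels have already been extracted (for example by an edge detector) as `(x, y)` points, and the function name `hough_skew_angle` and its parameters `max_skew` and `step` are invented for this example.

```python
import math
from collections import Counter


def hough_skew_angle(edge_points, max_skew=45.0, step=0.5):
    """Estimate the direction (degrees) of the strongest straight line by Hough voting.

    Each candidate direction alpha is tried in turn. Points that lie on one
    line with that direction share the same signed distance rho from the
    origin, so they all vote for the same (alpha, rho) bin.
    """
    n_steps = int(round(2 * max_skew / step)) + 1
    votes = Counter()
    for i in range(n_steps):
        alpha = math.radians(-max_skew + i * step)
        c, s = math.cos(alpha), math.sin(alpha)
        for x, y in edge_points:
            # Distance of the point from the origin, measured along the line's normal.
            rho = round(-x * s + y * c)
            votes[(i, rho)] += 1
    # The fullest bin corresponds to the longest line found among the points.
    (best_i, _), _ = votes.most_common(1)[0]
    return -max_skew + best_i * step
```

The brute-force loop costs (number of points) x (number of angles), so it is only meant to show the principle. The result is limited to the `step` resolution and to skews within `max_skew` degrees of horizontal.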
Chillar Anand
1

You might also try scikit-image: http://scikit-image.org/docs/dev/auto_examples/

It's a great library for the Hough transform, and it also has other methods that help with this kind of task, such as the Radon transform and geometric transformations.

It is a Python library.

Dsmithos