2

I want to deskew an image using. To do that I wrote (admittedly with lots of help) a program that:

  1. transforms image to be a easier to compute (thresh, dilation, etc.)
  2. draws contours around all objects
  3. computes four extreme points around the text contours (ignoring anything with a margin)
  4. draws a rectangle around that area using cv2.minAreaRect

The idea was that cv2.minAreaRect returns the angle as well, which I could use to deskew the image. However, in my case it's –90°.

You can see a sample input image here. You can see the result I get here.

I tested the program on a “clean” image (MS Word Screenshot rotaten ≈ 30° in Gimp) and it gave an identical result.

My code:

import numpy as np
import cv2
import itertools

img = cv2.imread('zuo.png')
imgray = cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)

ret,thresh = cv2.threshold(imgray,64,255,0)
############
kernel = np.ones((2,2),np.uint8)
img_e = cv2.dilate(thresh,kernel,iterations = 1)
# cv2.imwrite("out_eroded.png", img_e)
# http://docs.opencv.org/3.0-beta/doc/py_tutorials/py_imgproc/py_morphological_ops/py_morphological_ops.html
# img_e = thresh
############
imgbw, contours, hierarchy = cv2.findContours(img_e,cv2.RETR_TREE,cv2.CHAIN_APPROX_SIMPLE)
# imgbw, contours, hierarchy = cv2.findContours(thresh,cv2.RETR_EXTERNAL,cv2.CHAIN_APPROX_SIMPLE)

margin_distance = 25

def flatten(arr, n = 1):
    # print(arr)
    ret = list(itertools.chain.from_iterable(arr))
    # print(ret)
    if n != 1:
        return flatten(ret, n - 1)
    else:
        return ret

# print(list(flatten([[1,2,3],[4,5,6], [7], [8,9]])))

def get_min_max_values(cs, im_y, im_x):
    # print(flatten(cs), 1)
    # print(im_y, im_x)
    min_y = im_y - margin_distance
    min_x = im_x - margin_distance
    max_y = margin_distance
    max_x = margin_distance
    for lvl1 in cs:
        for lvl2 in lvl1:
            x, y = lvl2[0]
            # x = im_x - x
            # y = im_y - y
            max_y = max(y, max_y) if y + margin_distance < im_y else max_y
            max_x = max(x, max_x) if x + margin_distance < im_x else max_x
            min_y = min(y, min_y) if y > margin_distance else min_y
            min_x = min(x, min_x) if x > margin_distance else min_x

    return ((min_y, min_x), (min_y, max_x), (max_y, min_x), (max_y, max_x))

new_rect = get_min_max_values(contours, len(img), len(img[0]))
new_rect = list(map(lambda x: list(x)[::-1], list(new_rect)))
print(new_rect)
rect = cv2.minAreaRect(np.int0(new_rect))
# print(rect)
print(rect)
box = cv2.boxPoints(rect)
box = np.int0(box)

img_out = cv2.drawContours(img, [box], -1, (0,0,255), 5) # -1 = wszystkie kontury
img_out = cv2.drawContours(img, contours, -1, (0,255,0), 3)

cv2.imwrite("out.png", img_out)

Why isn't the rectangle skewed to match the text? I don't see any artifacts that would justify that.

EDIT: Added clean, born digital files: input and output.

Dan Mašek
  • 17,852
  • 6
  • 57
  • 85
MrVocabulary
  • 597
  • 1
  • 11
  • 31
  • 1
    Can you plot (with a red circle or whatever) the points contained in `new_rect`? – Miki Feb 15 '17 at 17:15
  • can you add the GIMP image and result, too? – Micka Feb 15 '17 at 17:16
  • 1
    in get_min_max_values you return an axis aligned rect (corner points), so that's what minAreaRect optimizes. – Micka Feb 15 '17 at 17:20
  • @Miki do you meant generate a shape and draw it on the original file using these coordinates? – MrVocabulary Feb 15 '17 at 18:18
  • I was thinking about the same thing :D – Jeru Luke Feb 15 '17 at 18:19
  • 2
    I mean that you'd better draw those points (just use "circle" on your image) because they are probably wrong, as Micka said – Miki Feb 15 '17 at 18:21
  • @Micka added the Gimp result. I actually considered the mistake, but I have no idea how to rework this bit to rectify that problem—can you suggest what should I do more specifically? – MrVocabulary Feb 15 '17 at 18:21
  • Why don't you get the `minAreaRect` of all the points in `contours`? And avoid that `get_min_max_values` that seems wrong to me? – Miki Feb 15 '17 at 18:38
  • @Miki because that would just generate a contour around the entire file (from 0,0, to -1,-1). That plus scanning artifacts that could interfere is what made me think that a "ignore margin" is necessary here so that only text is taken into consideration. And I have no idea how to discard these directly from `contours`. – MrVocabulary Feb 15 '17 at 18:48
  • can you add a thresholded image? I would try either dilation or close operators and filter contour area sizes or filter contour area sizes directly. Anothet approach could be to start at the image center and sample "rays" to the image borders until the density of foreground pixels drops down. – Micka Feb 15 '17 at 22:00

1 Answers1

4

TLDR: Use convex hull instead of four points only!

Part 1: The error(s) in your current approach.

Your function get_min_max_values computes the corner points of axis-aligned bounding box of all contours. But what you actually want to compute here are the coordinates of the leftmost, topmost, rightmost and the botommost point of all the contours.

Instead of only "remembering" the minimal y, you have to retain both coordinates of the point where y was minimal (topmost point). The same applies for all other points.

The code below shows how to compute those points properly. I decided to keep the code snippet short and readable, that's why I only show how to compute leftmost and topmost point here. All four points are computed in the same way anyway ...

As you will notice, I do not compare (clamp) the points to the margin withing the loop; instead, I do this only once at the end of the loop since doing this produces the same results but the code is simpler.

def get_min_max_values(cs, im_height, im_width):

  min_y = im_height - margin_distance
  min_x = im_width - margin_distance

  left_point = (min_y, min_x)
  top_point = (min_y, min_x)

  for lvl1 in cs:
    for lvl2 in lvl1:
        x, y = lvl2[0]

        left_point = left_point if x > left_point[1] else (y, x) 
        top_point  = top_point if y > top_point[0]  else (y, x)

  left_point[0] = left_point[0] if left_point[0] > margin_distance else margin_distance + 1
  left_point[1] = left_point[1] if left_point[1] > margin_distance else margin_distance + 1

  top_point[0] = top_point[0] if top_point[0] > margin_distance else margin_distance + 1
  top_point[1] = top_point[1] if top_point[1] > margin_distance else margin_distance + 1


  return (top_point, left_point)

Now let us take look at the results:

enter image description here

You can see that all four "extremal" points are indeed inside the rotated rectangle but lots of other points remain outside because of the "minimal area" constraint. You need to take all "border" points into account when you compute minumum rotated bounding rectangle to make this work right.

Part 2: The solution that works and requires minimal changes in your code

After you compute the contours with findContours, you have to copy all those contour points to the same array and then finally you have pass that array to the convexHull function. This function computes the convex hull points. You then use those points as input for minAreaRect function and this is what you obtain:

enter image description here

Further improving your solution

I'm quite sure your algorithm can run much faster if you do not compute contours at all. Instead, just use the thresholded pixel positions as inputs for convex hull function.

Nejc
  • 927
  • 6
  • 15
  • Thanks for the suggestion and sample code. I tried extrapolating and adapting the remainder of the code you suggested, but it keeps returning really odd coordinates. I have an idea on how to apply your final suggestion, so let me accept your answer despite the problems I have with running this code. – MrVocabulary Feb 17 '17 at 07:54
  • I only now noticed you edited the post before I answered (did not refresh). Thanks a bunch for your help—the output you obtained is exactly what I was looking for! However, I have trouble reproducing your result—it just won't draw the rectangle. Could you take a look at my attempt at it and tell me what I am missing? http://pastebin.com/VEVHgtii – MrVocabulary Feb 17 '17 at 16:23
  • 1
    Hello, it looks you only have to make sure that you put *all* points from *all* found contours into a single array, which you then pass as argument to convexHull. By the way, don't forget about my advice to avoid using contours and use thresholded points directly instead; this will make job easier and the results will be the same. :) – Nejc Feb 17 '17 at 22:29
  • Ah, I see. I will try doing that. Your advice did not go unnoticed, but I mostly understand the code in the contour approach, while I don't have any idea yet how to send pixels directly. Once I make it work in _any_ way, however, I will be happy to try that better solution you proposed. In any case, thanks again! You are my favorite person this week. – MrVocabulary Feb 17 '17 at 22:39
  • 1
    How to get all thresholded pixels into an array (in pseudocode): http://pastebin.com/LdnAR3eH. Then use this array as convexHull argument. – Nejc Feb 17 '17 at 22:45
  • Now you are also my favorite person tomorrow. – MrVocabulary Feb 17 '17 at 22:50