I'm trying to extract the rotated bounding box of contours robustly. I would like to take an image, find the largest contour, get its rotated bounding box, rotate the image to make the bounding box vertical, and crop to size.

As a demonstration, the original image is linked in the code below. I would like to end up with the shoe rotated to vertical and cropped to size. The code below, taken from this answer, seems to work on simple images such as lines drawn with OpenCV, but not on photos.

[Image: original photo of the shoe]

I end up with this, which is rotated and cropped incorrectly:

[Image: result, rotated and cropped incorrectly]

EDIT: After changing the threshold type to cv2.THRESH_BINARY_INV, it is now rotated correctly but still cropped incorrectly:

[Image: result, rotated correctly but cropped incorrectly]

import cv2
import matplotlib.pyplot as plt
import numpy as np
import urllib.request
plot = lambda x: plt.imshow(x, cmap='gray').figure


url = 'https://i.imgur.com/4E8ILuI.jpg'
img_path = 'shoe.jpg'

urllib.request.urlretrieve(url, img_path)
img = cv2.imread(img_path, 0)
plot(img)


threshold_value, thresholded_img = cv2.threshold(
    img, 250, 255, cv2.THRESH_BINARY)
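# [-2] picks the contour list on both OpenCV 3 (three return values) and OpenCV 4 (two)
contours = cv2.findContours(
    thresholded_img, cv2.RETR_LIST, cv2.CHAIN_APPROX_NONE)[-2]
contours = sorted(contours, key=cv2.contourArea, reverse=True)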

shoe_contour = contours[0][:, 0, :]
min_area_rect = cv2.minAreaRect(shoe_contour)

def crop_minAreaRect(img, rect):

    # rotate img
    angle = rect[2]
    rows, cols = img.shape[0], img.shape[1]
    M = cv2.getRotationMatrix2D((cols / 2, rows / 2), angle, 1)
    img_rot = cv2.warpAffine(img, M, (cols, rows))

    # rotate bounding box
    rect0 = (rect[0], rect[1], 0.0)
    box = cv2.boxPoints(rect)
    pts = cv2.transform(np.array([box]), M)[0].astype(np.intp)
    pts[pts < 0] = 0

    # crop
    img_crop = img_rot[pts[1][1]:pts[0][1],
                       pts[1][0]:pts[2][0]]

    return img_crop


cropped = crop_minAreaRect(thresholded_img, min_area_rect)
plot(cropped)

How can I get the correct cropping?


Jeru Luke
Hatshepsut
  • That script seems incomplete: `NameError: name 'min_area_rect' is not defined`. – Dan Mašek Jul 24 '17 at 22:40
  • @DanMašek Thanks, fixed. – Hatshepsut Jul 24 '17 at 22:43
  • No problem. As first step, I'd suggest using `cv2.THRESH_BINARY_INV`. At the top level, `findContours` looks for white objects on black background, so with a white background the biggest contour corresponds to the whole image. – Dan Mašek Jul 24 '17 at 22:55
  • `minAreaRect` is also a little tricky. For the full image bounding box, I get `((492.5, 415.5), (829.0, 983.0), -90.0)` -- notice that it says it's higher than wider, with angle -90 degrees. This needs to be accounted for, otherwise it rotates when it shouldn't. – Dan Mašek Jul 24 '17 at 23:03
  • @DanMašek The cropping isn't right, edited question to show. – Hatshepsut Jul 25 '17 at 00:51
  • Yeah, there's a bunch of issues with that answer. Problem is you can't just rotate, you need to translate as well in some cases (such as this one, where the rotated rectangle is wider than the original image). – Dan Mašek Jul 25 '17 at 00:57
  • I am also looking for an answer to this question. Have you finally solved this? – jdhao Feb 20 '19 at 14:35
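
The two points raised in the comments above (the width/height ambiguity of `minAreaRect`, and the need to translate as well as rotate) can be sketched roughly as follows. This is only one possible approach, not the code from the linked answer. The helper names `upright_rect`, `crop_matrix`, and `crop_min_area_rect` are made up for illustration, and the matrix is assumed to follow the layout documented for `cv2.getRotationMatrix2D` with scale 1.

```python
import numpy as np


def upright_rect(rect):
    """Return an equivalent rect whose height is its longer side."""
    (cx, cy), (w, h), angle = rect
    if w > h:
        # Swapping the sides and turning by 90 degrees describes the same box.
        w, h = h, w
        angle = angle + 90.0 if angle < 0 else angle - 90.0
    return (cx, cy), (w, h), angle


def crop_matrix(rect):
    """2x3 affine matrix that rotates about the rect centre and then
    translates the rect onto an axis-aligned canvas of size (w, h)."""
    (cx, cy), (w, h), angle = rect
    a = np.deg2rad(angle)
    cos_a, sin_a = np.cos(a), np.sin(a)
    # Assumed to match cv2.getRotationMatrix2D((cx, cy), angle, 1.0).
    M = np.array([[cos_a, sin_a, (1 - cos_a) * cx - sin_a * cy],
                  [-sin_a, cos_a, sin_a * cx + (1 - cos_a) * cy]])
    # Translate so the rect centre lands on the centre of the output canvas.
    M[0, 2] += w / 2.0 - cx
    M[1, 2] += h / 2.0 - cy
    return M


def crop_min_area_rect(img, rect):
    """Hypothetical replacement for crop_minAreaRect in the question."""
    # OpenCV is imported here so the helpers above can be tried without it.
    import cv2
    rect = upright_rect(rect)
    w, h = (int(round(s)) for s in rect[1])
    # The output canvas is exactly the size of the box, so nothing is clipped.
    return cv2.warpAffine(img, crop_matrix(rect), (w, h))
```

With the names from the question, this would be called as `crop_min_area_rect(img, min_area_rect)`. Because the output canvas takes its size from the rectangle rather than from the source image, a box that is wider than the original image after rotation should no longer be cut off.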

1 Answer

After some research, this is what I get:

[Image: the shoe, rotated and cropped]

This is how I get it:

  • pad the original image on each side (by 500 pixels in my case)
  • find the four corner points of the shoe (the four points should form a polygon enclosing the shoe, but need not form an exact rectangle)
  • use the following code to crop the shoe:

import cv2
import numpy as np

img = cv2.imread("padded_shoe.jpg")
# four corner points for padded shoe
cnt = np.array([
    [[313, 794]],
    [[727, 384]],
    [[1604, 1022]],
    [[1304, 1444]]
])
print("shape of cnt: {}".format(cnt.shape))
rect = cv2.minAreaRect(cnt)
print("rect: {}".format(rect))

# corners of the rotated rectangle (np.int0 was removed in NumPy 2.0)
box = cv2.boxPoints(rect)
box = box.astype(np.intp)
width = int(rect[1][0])
height = int(rect[1][1])

# destination corners of the upright output, listed in the same order as box
src_pts = box.astype("float32")
dst_pts = np.array([[0, height-1],
                    [0, 0],
                    [width-1, 0],
                    [width-1, height-1]], dtype="float32")
M = cv2.getPerspectiveTransform(src_pts, dst_pts)
warped = cv2.warpPerspective(img, M, (width, height))
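
The padding step from the list above is not shown in the code, so here is a minimal sketch of how it might be done with NumPy. The helper names `pad_image` and `shift_points` are hypothetical, and the white fill value (255) is an assumption based on the photo's white background.

```python
import numpy as np


def pad_image(img, pad=500, value=255):
    """Surround img with a constant border of `pad` pixels on every side."""
    # Pad only the two spatial axes; a colour-channel axis is left untouched.
    widths = [(pad, pad), (pad, pad)] + [(0, 0)] * (img.ndim - 2)
    return np.pad(img, widths, mode="constant", constant_values=value)


def shift_points(points, pad=500):
    """Move (x, y) points found on the unpadded image into padded coordinates."""
    return np.asarray(points) + pad
```

For example, `padded = pad_image(cv2.imread("shoe.jpg"))` would give an image like the `padded_shoe.jpg` loaded above, and any corner points measured on the unpadded image would need `shift_points` applied before being passed to `cv2.minAreaRect`. `cv2.copyMakeBorder` with `cv2.BORDER_CONSTANT` should give a similar result.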

Cheers, hope it helps.

jdhao