Cropping an image after Rotation, Scaling and Translation (with Python Transformation Matrix) such that there is no black background

Question

I have pairs of images of the same 2D object with very minor differences. The two images of a pair have two reference points (a star [x_s,y_s] and an arrow-head [x_a,y_a]) as shown below:

The Image Pair

I have written a Python script to align one image with reference to the second image of the pair with the reference points/coordinates. Please go through the code below for a clear understanding:


import math
import numpy as np
import cv2
import pandas as pd

# Function to align image2 with respect to image1:

def alignFromReferenceImage(image1, imgname1, image2, imgname2):
    
    # Using Panda dataframe to read the coordinate values ((x_s,y_s) and (x_a,y_a)) from a csv file
    #
    # The .csv file looks like this:-
    #
    #     id;x_s;y_s;x_a;y_a
    #     img11;113;433;45;56
    #     img12;54;245;55;77
    #     img21;33;76;16;88
    #     img22;62;88;111;312
    #     ...  ;..;..;...;  

    df = pd.read_csv("./image_metadata.csv",  delimiter= ';')

    # Eliminate .jpg from the image name and fetch the row

    filter_data=df[df.isin([imgname1.split('.')[0]]).any(axis=1)]
    
    x1_s=filter_data['x_s'].values[0]
    y1_s=filter_data['y_s'].values[0]
    
    x1_a=filter_data['x_a'].values[0]
    y1_a=filter_data['y_a'].values[0]

    filter_data2=df[df.isin([imgname2.split('.')[0]]).any(axis=1)]
    
    x2_s=filter_data2['x_s'].values[0]
    y2_s=filter_data2['y_s'].values[0]
    
    x2_a=filter_data2['x_a'].values[0]
    y2_a=filter_data2['y_a'].values[0]
    
    tx=x2_s-x1_s
    ty=y2_s-y1_s
    
    rows,cols = image1.shape[:2]  # works for grayscale and colour images
    M = np.float32([[1,0,-tx],[0,1,-ty]])
    image_after_translation = cv2.warpAffine(image2,M,(cols,rows))
    
    d1 = math.sqrt((x1_a - x1_s)**2 + (y1_a - y1_s)**2)
    d2 = math.sqrt((x2_a - x2_s)**2 + (y2_a - y2_s)**2)
    
    dx1 = x1_a - x1_s
    dy1 = -(y1_a - y1_s)
    
    alpha1 = math.degrees(math.atan2(dy1, dx1))
    alpha1=(360+alpha1) if (alpha1<0) else alpha1
    
    dx2 = x2_a - x2_s
    dy2 = -(y2_a - y2_s)

    alpha2 = math.degrees(math.atan2(dy2, dx2))
    alpha2=(360+alpha2) if (alpha2<0) else alpha2
    
    ang=alpha1-alpha2
    
    scale = d1 / d2 
    
    centre = (filter_data['x_s'].values[0], filter_data['y_s'].values[0])
    
    M = cv2.getRotationMatrix2D((centre),ang,scale)
    aligned_image = cv2.warpAffine(image_after_translation, M, (cols,rows))

    return aligned_image

After alignment, the image looks as shown below:

Image After Alignment

Important: Now, after aligning one image with respect to the other, I want to crop the aligned image so that no black background remains after cropping. The picture below shows what I want to do:

Image After Cropping

I have researched this and found some useful links:

But these posts only discuss rotation, and I have no clue how the maths works for translation and scaling. Any help with this problem would be highly appreciated.

so you want to _inscribe_ a rectangle into a rotated rectangle? I think that has been discussed before. -- these things discuss rotation because translation and scaling are trivial in comparison. all you have is a rectangle with different corner coordinates. — Christoph Rackwitz, Nov 23 '22 at 15:43
I am trying to understand the math behind it. I was just wondering if there are some shortcuts or an OpenCV function that can do the trick. But it seems to be a non-linear optimization problem to fit the largest rectangle inside a rotated one. — Neil, Nov 23 '22 at 16:13
There are multiple possible rectangles that you can crop in your example, are you saying you don’t care which of them you get? — Cris Luengo, Nov 25 '22 at 23:38
I want to crop out the largest rectangle since I want to train a CNN which means more area would mean more information about the object. — Neil, Nov 27 '22 at 10:41
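
The point made in the comments, that translation and scaling only change the corner coordinates of the rectangle, can be illustrated with a minimal NumPy sketch. It assumes the same order of operations as the question's code (translate by `(-tx, -ty)`, then rotate and scale about `centre`) and builds the rotation matrix from the formula documented for `cv2.getRotationMatrix2D`. The function names are hypothetical and not part of the original post.

```python
import numpy as np

def rotation_matrix_3x3(centre, angle_deg, scale):
    """3x3 form of the matrix documented for cv2.getRotationMatrix2D."""
    cx, cy = centre
    a = scale * np.cos(np.radians(angle_deg))
    b = scale * np.sin(np.radians(angle_deg))
    return np.array([[a, b, (1 - a) * cx - b * cy],
                     [-b, a, b * cx + (1 - a) * cy],
                     [0.0, 0.0, 1.0]])

def translation_matrix_3x3(tx, ty):
    """3x3 homogeneous translation matrix."""
    return np.array([[1.0, 0.0, tx],
                     [0.0, 1.0, ty],
                     [0.0, 0.0, 1.0]])

def warped_corners(width, height, tx, ty, centre, angle_deg, scale):
    """Return the four warped image corners and the combined 2x3 matrix.

    Assumption: the translation (-tx, -ty) is applied first, then the
    rotation and scaling, as in the question's two warpAffine calls.
    """
    M = rotation_matrix_3x3(centre, angle_deg, scale) @ translation_matrix_3x3(-tx, -ty)
    corners = np.array([[0, 0, 1],
                        [width, 0, 1],
                        [width, height, 1],
                        [0, height, 1]], dtype=float).T
    return (M @ corners)[:2].T, M[:2]

# Example with made-up values: a 100x50 image shifted, rotated and scaled
corners, M = warped_corners(100, 50, tx=10, ty=5, centre=(50, 25),
                            angle_deg=15, scale=1.1)
print(corners)
```

The returned quadrilateral, clipped to the output frame, bounds the pixels that hold real image content, so the crop being asked for is an axis-aligned rectangle inscribed in that region. The combined 2x3 matrix could also be passed to a single `cv2.warpAffine` call instead of warping twice.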

score 2 · Answer 1 · answered Nov 24 '22 at 01:03

If you want "any help" and are willing to use ImageMagick 7, then there is a simple solution using its aggressive trim.

Input:

magick -fuzz 20% img.png +repage -bordercolor black -border 2 -background black -define trim:percent-background=0% -trim +repage img_trim.png

fmw42 · Answer 2 · 2022-11-25T22:07:18.730

Here is a Python/OpenCV solution. It first thresholds the image so that the background is black and the rest is white. It then computes the mean of each edge (a one-pixel row or column) of the thresholded image and finds the edge with the lowest mean. If that edge is not pure white (mean of 255), it trims that edge by one pixel and repeats. An edge is marked as finished once its mean is 255. When all four edges are finished, it stops and uses the accumulated offsets on each side to compute the crop of the original input.

Input:

Note: I had to adjust the crop of your posted image to ensure the background on all sides was pure black. It would have helped if you had provided separate images. If the sides were still slightly gray, then I would have increased the upper threshold limit.

import cv2
import numpy as np

# read image
img = cv2.imread('star_arrow.png')
h, w = img.shape[:2]

# threshold so border is black and rest is white.
# Note this image has pure black for the background, so threshold at black and invert.
# Adjust lower and upper if the background is not pure black.
lower = (0,0,0)
upper = (0,0,0)
mask = cv2.inRange(img, lower, upper)
mask = 255 - mask

# define starting top and left coordinates and starting bottom and right limits (exclusive)
top = 0
left = 0
bottom = h
right = w

# compute the mean of each side of the image and its stop test
mean_top = np.mean( mask[top:top+1, left:right] )
mean_left = np.mean( mask[top:bottom, left:left+1] )
mean_bottom = np.mean( mask[bottom-1:bottom, left:right] )
mean_right = np.mean( mask[top:bottom, right-1:right] )

mean_minimum = min(mean_top, mean_left, mean_bottom, mean_right)

top_test = "stop" if (mean_top == 255) else "go"
left_test = "stop" if (mean_left == 255) else "go"
bottom_test = "stop" if (mean_bottom == 255) else "go"
right_test = "stop" if (mean_right == 255) else "go"

# iterate to compute new side coordinates if mean of given side is not 255 (all white) and it is the current darkest side
while top_test == "go" or left_test == "go" or right_test == "go" or bottom_test == "go":

    # top processing
    if top_test == "go":
        if mean_top != 255:
            if mean_top == mean_minimum:
                top += 1
                mean_top = np.mean( mask[top:top+1, left:right] )
                mean_left = np.mean( mask[top:bottom, left:left+1] )
                mean_bottom = np.mean( mask[bottom-1:bottom, left:right] )
                mean_right = np.mean( mask[top:bottom, right-1:right] )
                mean_minimum = min(mean_top, mean_left, mean_right, mean_bottom)
                #print("top",mean_top)
                continue
        else:
            top_test = "stop"   

    # left processing
    if left_test == "go":
        if mean_left != 255:
            if mean_left == mean_minimum:
                left += 1
                mean_top = np.mean( mask[top:top+1, left:right] )
                mean_left = np.mean( mask[top:bottom, left:left+1] )
                mean_bottom = np.mean( mask[bottom-1:bottom, left:right] )
                mean_right = np.mean( mask[top:bottom, right-1:right] )
                mean_minimum = min(mean_top, mean_left, mean_right, mean_bottom)
                #print("left",mean_left)
                continue
        else:
            left_test = "stop"  

    # bottom processing
    if bottom_test == "go":
        if mean_bottom != 255:
            if mean_bottom == mean_minimum:
                bottom -= 1
                mean_top = np.mean( mask[top:top+1, left:right] )
                mean_left = np.mean( mask[top:bottom, left:left+1] )
                mean_bottom = np.mean( mask[bottom-1:bottom, left:right] )
                mean_right = np.mean( mask[top:bottom, right-1:right] )
                mean_minimum = min(mean_top, mean_left, mean_right, mean_bottom)
                #print("bottom",mean_bottom)
                continue
        else:
            bottom_test = "stop"    

    # right processing
    if right_test == "go":
        if mean_right != 255:
            if mean_right == mean_minimum:
                right -= 1
                mean_top = np.mean( mask[top:top+1, left:right] )
                mean_left = np.mean( mask[top:bottom, left:left+1] )
                mean_bottom = np.mean( mask[bottom-1:bottom, left:right] )
                mean_right = np.mean( mask[top:bottom, right-1:right] )
                mean_minimum = min(mean_top, mean_left, mean_right, mean_bottom)
                #print("right",mean_right)
                continue
        else:
            right_test = "stop" 


# crop input
result = img[top:bottom, left:right]

# print crop values 
print("top: ",top)
print("bottom: ",bottom)
print("left: ",left)
print("right: ",right)
print("height:",result.shape[0])
print("width:",result.shape[1])

# save cropped image
#cv2.imwrite('border_image1_cropped.png',result)
cv2.imwrite('img_cropped.png',result)
cv2.imwrite('img_mask.png',mask)

# show the images
cv2.imshow("mask", mask)
cv2.imshow("cropped", result)
cv2.waitKey(0)
cv2.destroyAllWindows()

Threshold Image:

Cropped Input:

ADDITION

Here is a version that shows an animation of the processing when run.

import cv2
import numpy as np

# read image
img = cv2.imread('star_arrow.png')
h, w = img.shape[:2]

# threshold so border is black and rest is white (invert as needed)
lower = (0,0,0)
upper = (0,0,0)
mask = cv2.inRange(img, lower, upper)
mask = 255 - mask

# define starting top and left coordinates and starting bottom and right limits (exclusive)
top = 0
left = 0
bottom = h
right = w

# compute the mean of each side of the image and its stop test
mean_top = np.mean( mask[top:top+1, left:right] )
mean_left = np.mean( mask[top:bottom, left:left+1] )
mean_bottom = np.mean( mask[bottom-1:bottom, left:right] )
mean_right = np.mean( mask[top:bottom, right-1:right] )

mean_minimum = min(mean_top, mean_left, mean_bottom, mean_right)

top_test = "stop" if (mean_top == 255) else "go"
left_test = "stop" if (mean_left == 255) else "go"
bottom_test = "stop" if (mean_bottom == 255) else "go"
right_test = "stop" if (mean_right == 255) else "go"

result = img[top:bottom, left:right]
cv2.imshow("result", result)
cv2.waitKey(100)

# iterate to compute new side coordinates if mean of given side is not 255 (all white) and it is the current darkest side
while top_test == "go" or left_test == "go" or right_test == "go" or bottom_test == "go":

    # top processing
    if top_test == "go":
        if mean_top != 255:
            if mean_top == mean_minimum:
                top += 1
                mean_top = np.mean( mask[top:top+1, left:right] )
                mean_left = np.mean( mask[top:bottom, left:left+1] )
                mean_bottom = np.mean( mask[bottom-1:bottom, left:right] )
                mean_right = np.mean( mask[top:bottom, right-1:right] )
                mean_minimum = min(mean_top, mean_left, mean_right, mean_bottom)
                #print("top",mean_top)
                result = img[top:bottom, left:right]
                cv2.imshow("result", result)
                cv2.waitKey(100)
                continue
        else:
            top_test = "stop"   

    # left processing
    if left_test == "go":
        if mean_left != 255:
            if mean_left == mean_minimum:
                left += 1
                mean_top = np.mean( mask[top:top+1, left:right] )
                mean_left = np.mean( mask[top:bottom, left:left+1] )
                mean_bottom = np.mean( mask[bottom-1:bottom, left:right] )
                mean_right = np.mean( mask[top:bottom, right-1:right] )
                mean_minimum = min(mean_top, mean_left, mean_right, mean_bottom)
                #print("left",mean_left)
                result = img[top:bottom, left:right]
                cv2.imshow("result", result)
                cv2.waitKey(100)
                continue
        else:
            left_test = "stop"  

    # bottom processing
    if bottom_test == "go":
        if mean_bottom != 255:
            if mean_bottom == mean_minimum:
                bottom -= 1
                mean_top = np.mean( mask[top:top+1, left:right] )
                mean_left = np.mean( mask[top:bottom, left:left+1] )
                mean_bottom = np.mean( mask[bottom-1:bottom, left:right] )
                mean_right = np.mean( mask[top:bottom, right-1:right] )
                mean_minimum = min(mean_top, mean_left, mean_right, mean_bottom)
                #print("bottom",mean_bottom)
                result = img[top:bottom, left:right]
                cv2.imshow("result", result)
                cv2.waitKey(100)
                continue
        else:
            bottom_test = "stop"    

    # right processing
    if right_test == "go":
        if mean_right != 255:
            if mean_right == mean_minimum:
                right -= 1
                mean_top = np.mean( mask[top:top+1, left:right] )
                mean_left = np.mean( mask[top:bottom, left:left+1] )
                mean_bottom = np.mean( mask[bottom-1:bottom, left:right] )
                mean_right = np.mean( mask[top:bottom, right-1:right] )
                mean_minimum = min(mean_top, mean_left, mean_right, mean_bottom)
                #print("right",mean_right)
                result = img[top:bottom, left:right]
                cv2.imshow("result", result)
                cv2.waitKey(100)
                continue
        else:
            right_test = "stop" 


# crop input
result = img[top:bottom, left:right]

# print crop values 
print("top: ",top)
print("bottom: ",bottom)
print("left: ",left)
print("right: ",right)
print("height:",result.shape[0])
print("width:",result.shape[1])

# save cropped image
cv2.imwrite('img_cropped.png',result)
cv2.imwrite('img_mask.png',mask)

# show the images
cv2.waitKey(0)
cv2.destroyAllWindows()

I posted an ADDITION in my answer that is a version of the code that shows an animation of the trimming when the code is run. — fmw42, Nov 25 '22 at 22:08
`@Cris Luengo` No guarantees for largest rectangle. But often is the case or close. But can get very bad results if there is any non-black (near-black) along the edges. So the threshold is very important to ensure the background is uniform black in the mask with no extraneous white pixels. — fmw42, Nov 25 '22 at 23:43
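
Since the edge-trimming approach gives no guarantee of finding the largest rectangle, one possible alternative is an exhaustive search with the classic "largest rectangle in a histogram" stack technique, applied row by row to the mask. This is a different technique from the code in the answer, shown only as a sketch. It assumes `mask` is a 2D single-channel array in which non-zero means valid image content, and the function name `largest_white_rectangle` is hypothetical.

```python
import numpy as np

def largest_white_rectangle(mask):
    """Return (top, bottom, left, right) of the largest all-non-zero
    axis-aligned rectangle in a 2D mask; bottom and right are exclusive."""
    h, w = mask.shape
    # heights[c] = number of consecutive non-zero pixels in column c ending
    # at the current row; the extra trailing 0 is a sentinel that flushes
    # the stack at the end of every row
    heights = np.zeros(w + 1, dtype=int)
    best_area = 0
    best = (0, 0, 0, 0)
    for row in range(h):
        heights[:w] = np.where(mask[row] > 0, heights[:w] + 1, 0)
        stack = []  # column indices whose heights are strictly increasing
        for col in range(w + 1):
            while stack and heights[stack[-1]] >= heights[col]:
                height = int(heights[stack.pop()])
                left = stack[-1] + 1 if stack else 0
                area = height * (col - left)
                if area > best_area:
                    best_area = area
                    best = (row - height + 1, row + 1, left, col)
            stack.append(col)
    return best

# Example on a synthetic mask: a white block with one black corner pixel
mask = np.zeros((6, 8), dtype=np.uint8)
mask[1:5, 2:7] = 255
mask[1, 2] = 0
top, bottom, left, right = largest_white_rectangle(mask)
print(top, bottom, left, right)
```

The result can be used like the crop values in the answer, for example `img[top:bottom, left:right]`. The pure-Python loops visit every pixel once, so this sketch may be slow on large images, and it is just as sensitive to stray near-black pixels in the mask as the trimming approach.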
