
I have source (src) image(s) I wish to align to a destination (dst) image using an Affine Transformation whilst retaining the full extent of both images during alignment (even the non-overlapping areas).

I am already able to calculate the Affine Transformation rotation and offset matrix, which I feed to scipy.ndimage.affine_transform to recover the dst-aligned src image.

The problem is that, when the images are not fully overlapping, the resultant image is cropped to only the common footprint of the two images. What I need is the full extent of both images, placed on the same pixel coordinate system. This question is almost a duplicate of this one - and the excellent answer and repository there provide this functionality for OpenCV transformations. I unfortunately need this for scipy's implementation.

Much too late, after repeatedly hitting a brick wall trying to translate the above question's answer to scipy, I came across this issue and subsequently followed it to this question. The latter question did give some insight into the wonderful world of scipy's affine transformations, but I have as yet been unable to crack my particular case.

The transformations from src to dst can include translations and rotations. I can get translation-only transformations working (an example is shown below) and rotation-only transformations working (largely by hacking around the below and taking inspiration from the use of the reshape argument in scipy.ndimage.interpolation.rotate). However, I am getting thoroughly lost combining the two. I have tried to calculate what should be the correct offset (see this question's answers again), but I can't get it working in all scenarios.
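For reference, my understanding of scipy's convention (and possibly where I am going wrong) is that affine_transform maps each output coordinate back to an input coordinate, so the forward transform has to be inverted before being passed in. A minimal sketch of that convention, with made-up values rather than my real data:

import numpy as np
from scipy.ndimage import affine_transform

# affine_transform samples the input at  input_coord = matrix @ output_coord + offset,
# i.e. it wants the inverse of the forward mapping  out = R @ in + t
theta = np.deg2rad(10)
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
t = np.array([5.0, -3.0])  # made-up (row, col) translation
img = np.random.rand(60, 80).astype(np.float32)
warped = affine_transform(img, np.linalg.inv(R), offset=-np.linalg.inv(R) @ t,
                          order=1, mode="constant", cval=-1)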

A translation-only working example of the padded affine transformation, which largely follows this repo, explained in this answer:

from scipy.ndimage import rotate, affine_transform
import numpy as np
import matplotlib.pyplot as plt

nblob = 50
shape = (200, 100)
buffered_shape = (300, 200)  # buffer for rotation and translation


def affine_test(angle=0, translate=(0, 0)):
    np.random.seed(42)
    # Maximum translation allowed is half the difference between shape and buffered_shape

    # Generate a buffered_shape-sized base image with random blobs
    base = np.zeros(buffered_shape, dtype=np.float32)
    random_locs = np.random.choice(np.arange(2, buffered_shape[0] - 2), nblob * 2, replace=False)
    i = random_locs[:nblob]
    j = random_locs[nblob:]
    for k, (_i, _j) in enumerate(zip(i, j)):
        # Use different values, just to make it easier to distinguish blobs
        base[_i - 2 : _i + 2, _j - 2 : _j + 2] = k + 10

    # Impose a rotation and translation on source
    src = rotate(base, angle, reshape=False, order=1, mode="constant")
    bsc = (np.array(buffered_shape) / 2).astype(int)
    sc = (np.array(shape) / 2).astype(int)
    src = src[
        bsc[0] - sc[0] + translate[0] : bsc[0] + sc[0] + translate[0],
        bsc[1] - sc[1] + translate[1] : bsc[1] + sc[1] + translate[1],
    ]
    # Cut-out destination from the centre of the base image
    dst = base[bsc[0] - sc[0] : bsc[0] + sc[0], bsc[1] - sc[1] : bsc[1] + sc[1]]

    src_y, src_x = src.shape

    def get_matrix_offset(centre, angle, scale):
        """Follows OpenCV.getRotationMatrix2D"""
        angle = angle * np.pi / 180
        alpha = scale * np.cos(angle)
        beta = scale * np.sin(angle)
        return (
            np.array([[alpha, beta], [-beta, alpha]]),
            np.array(
                [
                    (1 - alpha) * centre[0] - beta * centre[1],
                    beta * centre[0] + (1 - alpha) * centre[1],
                ]
            ),
        )
    # Obtain the rotation matrix and offset that describes the transformation
    # between src and dst
    matrix, offset = get_matrix_offset(np.array([src_y / 2, src_x / 2]), angle, 1)
    offset = offset - translate

    # Determine the outer bounds of the new image
    lin_pts = np.array([[0, src_x, src_x, 0], [0, 0, src_y, src_y]])
    transf_lin_pts = np.dot(matrix.T, lin_pts) - offset[::-1].reshape(2, 1)

    # Find min and max bounds of the transformed image
    min_x = np.floor(np.min(transf_lin_pts[0])).astype(int)
    min_y = np.floor(np.min(transf_lin_pts[1])).astype(int)
    max_x = np.ceil(np.max(transf_lin_pts[0])).astype(int)
    max_y = np.ceil(np.max(transf_lin_pts[1])).astype(int)

    # Add translation to the transformation matrix to shift to positive values
    anchor_x, anchor_y = 0, 0
    if min_x < 0:
        anchor_x = -min_x
    if min_y < 0:
        anchor_y = -min_y
    shifted_offset = offset - np.dot(matrix, [anchor_y, anchor_x])

    # Create padded destination image
    dst_h, dst_w = dst.shape[:2]
    pad_widths = [anchor_y, max(max_y, dst_h) - dst_h, anchor_x, max(max_x, dst_w) - dst_w]
    dst_padded = np.pad(
        dst,
        ((pad_widths[0], pad_widths[1]), (pad_widths[2], pad_widths[3])),
        "constant",
        constant_values=-1,
    )
    dst_pad_h, dst_pad_w = dst_padded.shape

    # Create the aligned and padded source image
    source_aligned = affine_transform(
        src,
        matrix.T,
        offset=shifted_offset,
        output_shape=(dst_pad_h, dst_pad_w),
        order=3,
        mode="constant",
        cval=-1,
    )

    # Plot the images
    fig, axes = plt.subplots(1, 4, figsize=(10, 5), sharex=True, sharey=True)
    axes[0].imshow(src, cmap="viridis", vmin=-1, vmax=nblob)
    axes[0].set_title("Source")
    axes[1].imshow(dst, cmap="viridis", vmin=-1, vmax=nblob)
    axes[1].set_title("Dest")
    axes[2].imshow(source_aligned, cmap="viridis", vmin=-1, vmax=nblob)
    axes[2].set_title("Source aligned to Dest padded")
    axes[3].imshow(dst_padded, cmap="viridis", vmin=-1, vmax=nblob)
    axes[3].set_title("Dest padded")
    plt.show()

e.g.:

affine_test(0, (-20, 40))

gives:

[image: Source, Dest, Source aligned to Dest padded, and Dest padded panels]

With a zoom-in showing the alignment within the padded images:

[image: zoomed-in view of the aligned, padded images]

I require the full extent of the src and dst images aligned on the same pixel coordinates, with both rotations and translations.

Any help is greatly appreciated!

Jdog
  • Please post sample input srcs and dsts for us to work with. – Red Mar 21 '22 at 16:41
  • The function `affine_test` above will create `src` and `dst`, applying a rotation (`angle`) and translation (`translate`) based on the arguments passed. – Jdog Mar 21 '22 at 17:46
  • To me, your images are a bit confusing, as I don't see the alignment. Can you elaborate on them a bit so that I might find an answer? – Red Mar 23 '22 at 20:39
  • In the images, "Source" is a translated version of "Dest": if you move "Source" -20 pixels in y and +40 pixels in x, you end up with "Dest". The padded versions may not look aligned, because the eye plays tricks, but they are perfectly aligned, and they have - **crucially** - been padded with values of -1 such that each respective image preserves their full initial footprint and that of the other image. In this sense if I stacked them on top of each other I would end up with an image that covered the union of their initial footprint. The bottom row is just a zoom in of this same row of images. – Jdog Mar 24 '22 at 13:05

3 Answers


Complexity analysis

The problem is to determine three parameters: the rotation angle and the x and y displacements.

Let's suppose that you have a grid for the angle and the x and y displacements, each of size O(n), and that your images are of size O(n x n). Rotation, translation, and comparison of the images each take O(n^2); since you have O(n^3) candidate transforms to try, you end up with complexity O(n^5), and that is probably why you are asking the question.

However, the displacement part can be computed more efficiently as a maximum correlation using Fourier transforms. The Fourier transforms can be performed with complexity O(n log n) per axis, and we have to perform them over the two spatial dimensions, so the complete correlation matrix can be computed in O(n^2 log^2 n); finding its maximum then costs O(n^2), so the overall time complexity of determining the best displacement is O(n^2 log^2 n). We still have to search for the best angle, and since there are O(n) candidate angles the overall complexity of that search is O(n^3 log^2 n). Remember that we are using Python and may have significant overhead, so this complexity only gives an idea of how difficult the problem will be. I have handled problems like this before, so I start confident.
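As a quick 1D sanity check of the correlation idea (my own sketch, separate from the image example below): the peak of the circular cross-correlation, computed with FFTs, recovers a circular shift.

import numpy as np

# recover a known circular shift of a random 1D signal via FFT cross-correlation
rng = np.random.default_rng(0)
x = rng.random(256)
shift = 37
y = np.roll(x, shift)
corr = np.fft.irfft(np.fft.rfft(y) * np.conj(np.fft.rfft(x)), n=len(x))
print(np.argmax(corr))  # prints 37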

Preparing an example

I will start by downloading an image, applying a rotation, and centralizing the image by padding with zeros.


import numpy as np
import matplotlib.pyplot as plt
from scipy import ndimage

# `im` used below is the downloaded image (any RGB image, e.g. loaded with PIL)


def centralized(a, width, height):
    '''
    Image centralized to the given width and height
    by padding with zeros (black)
    '''
    assert width >= a.shape[0] and height >= a.shape[1]
    ap = np.zeros((width, height) + a.shape[2:], a.dtype)
    ccx = (width - a.shape[0])//2
    ccy = (height - a.shape[1])//2
    ap[ccx:ccx+a.shape[0], ccy:ccy+a.shape[1], ...] = a
    return ap

def image_pair(im, width, height, displacement=(0,0), angle=0):
    '''
    Build a pair of images as numpy arrays from the input image.
    Both images will be padded with zeros (black),
    roughly centralized, and with the specified shape.

    Make sure that the width and height chosen are enough
    to fit the rotated image.
    '''
    a = np.array(im)
    a1 = centralized(a, width, height)
    a2 = centralized(ndimage.rotate(a, angle), width, height)
    a2 = np.roll(a2, displacement, axis=(0,1))
    return a1, a2

def random_transform():
    angle = np.random.rand() * 360
    displacement = np.random.randint(-100, 100, 2)
    return displacement, angle

a1, a2 = image_pair(im, 512, 512, *random_transform())
plt.subplot(121)
plt.imshow(a1)
plt.subplot(122)
plt.imshow(a2)

[image: the original image (left) and its rotated, shifted copy (right)]

The displacement search

The first thing is to compute the correlation of the two images:

def compute_correlation(a1, a2):
    A1 = np.fft.rfftn(a1, axes=(0,1))
    A2 = np.fft.rfftn(a2, axes=(0,1))
    C = np.fft.irfftn(np.sum(A1 * np.conj(A2), axis=2))
    return C

Then let's create an example without rotation and confirm that, with the index of the maximum correlation, we can find the displacement that fits one image to the other.

displacement, _ = random_transform()
a1, a2 = image_pair(im, 512, 512, displacement, angle=0)
C = compute_correlation(a1, a2)
np.unravel_index(np.argmax(C), C.shape), displacement
a3 = np.roll(a2, np.unravel_index(np.argmax(C), C.shape), axis=(0,1))
assert np.all(a3 == a1)

With rotation or interpolation this result may not be exact but it gives the displacement that will give us the closest possible alignment.

Let's put this in a function for future use

def get_aligned(a1, a2, angle):
    a1_rotated = ndimage.rotate(a1, angle, reshape=False)
    C = compute_correlation(a2, a1_rotated)
    found_displacement = np.unravel_index(np.argmax(C), C.shape)
    a1_aligned = np.roll(a1_rotated, found_displacement, axis=(0,1))
    return a1_aligned

Searching for the angle

Now we can do this in two steps: first we compute the correlation for each candidate angle, then with the angle that gives the maximum correlation we find the alignment.

displacement, angle = random_transform()
a1, a2 = image_pair(im, 512, 512, displacement, angle)
C_max = []
C_argmax = []
angle_guesses = np.arange(0, 360, 5)
for angle_guess in angle_guesses:
    a1_rotated = ndimage.rotate(a1, angle_guess, reshape=False)
    C = compute_correlation(a1_rotated, a2)
    i = np.argmax(C)
    v = C.reshape(-1)[i]
    C_max.append(v)
    C_argmax.append(i)

Let's see what the correlation looks like:

plt.plot(angle_guesses, C_max);

Looking at this curve, we have a clear winner, even though a sunflower has some rotational symmetry.

Let's apply the transformation to the original image and see how it looks:

a1_aligned = get_aligned(a1, a2, angle_guesses[np.argmax(C_max)])
plt.subplot(121)
plt.imshow(a2)
plt.subplot(122)
plt.imshow(a1_aligned)

[image: a2 (left) and the aligned a1 (right)]

Great, I wouldn't have done better than this manually.

I am using a sunflower image for aesthetic reasons, but the procedure is the same for any type of image. I use RGB to show that the image may have one additional dimension, i.e. a feature vector instead of a scalar feature; if your feature is a scalar, you can reshape your data to (width, height, 1).
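For example, a quick sketch of that reshape (gray1 and gray2 are placeholder single-channel arrays, not anything defined above):

import numpy as np

# give single-channel images a trailing feature axis so the correlation code
# above, which sums over axis=2, still works
gray1 = np.random.rand(512, 512)
gray2 = np.random.rand(512, 512)
a1 = gray1.reshape(gray1.shape + (1,))  # shape (512, 512, 1)
a2 = gray2.reshape(gray2.shape + (1,))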

Bob
  • OK, I guess my question isn't as clear as I thought it was. I **already have** the transformation between my two images. I know this extremely accurately - in reality they are not the same image being aligned, and they are very large, so FFTs become expensive and inaccurate. My sole concern is preserving the footprint of my respective images when I apply the transformation to place them on the same pixel coordinate system. Your example here has done that by just artificially padding the images by a large amount. This could be an option, but the question of correctly applying [1/2] – Jdog Mar 24 '22 at 12:57
  • the affine_transform remains, given the interplay between the offset and the rotation matrix. Additionally, the images, as mentioned, are very large and padding them by an amount to cover the largest potential transformation is probably not feasible [2/2] – Jdog Mar 24 '22 at 12:59
  • Can you please define more clearly what you mean by preserving footprint, and how you do a transformation that does not preserve footprint? – Bob Mar 24 '22 at 14:22
  • Take the question's example. If I do not try to account for the transformation, when using `scipy.ndimage.affine_transform` on the source (`src`) image, I am returned a (100,200) shape image where the only part of the image that survives is that which overlaps with the destination (`dst`). I need to preserve all parts of source, whether they overlap with destination or not (and vice-versa) – Jdog Mar 24 '22 at 14:43
  • So you want a transform that preserve skew and rotation but may translate the image so that it moves everything to the target image area? – Bob Mar 24 '22 at 14:53
  • Sort of, yes. I need to alter the transformation, and pad the `dst` image, so the transformed `src` and padded `dst` are on the same pixel coordinates. The answer and repo of an OpenCV implementation linked in the question show how this can be done, but, as the `scipy` issue also linked shows, `scipy` seems to do some weird stuff with the transformation such that the method, which works for OpenCV's affine transform, doesn't work. – Jdog Mar 24 '22 at 15:09
  • But padding can only increase pixel coordinates. Suppose the pixel coordinates for aligned `dst` are larger than the pixel coordinates for `src`; how do you expect the pixel coordinates to be aligned? Also, it looks like you are interested in points - couldn't you pass an array with the point coordinates and find a transformation to align them? Then you only work with coordinates and there is no heavy image manipulation. – Bob Mar 25 '22 at 08:22

Working code below, in case anyone else needs this for scipy's affine transformations:

import numpy as np
from scipy.ndimage import rotate, affine_transform


def affine_test(angle=0, translate=(0, 0), shape=(200, 100), buffered_shape=(300, 200), nblob=50):
    # Maximum translation allowed is half the difference between shape and buffered_shape

    np.random.seed(42)

    # Generate a buffered_shape-sized base image
    base = np.zeros(buffered_shape, dtype=np.float32)
    random_locs = np.random.choice(np.arange(2, buffered_shape[0] - 2), nblob * 2, replace=False)
    i = random_locs[:nblob]
    j = random_locs[nblob:]
    for k, (_i, _j) in enumerate(zip(i, j)):
        base[_i - 2 : _i + 2, _j - 2 : _j + 2] = k + 10

    # Impose a rotation and translation on source
    src = rotate(base, angle, reshape=False, order=1, mode="constant")
    bsc = (np.array(buffered_shape) / 2).astype(int)
    sc = (np.array(shape) / 2).astype(int)
    src = src[
        bsc[0] - sc[0] + translate[0] : bsc[0] + sc[0] + translate[0],
        bsc[1] - sc[1] + translate[1] : bsc[1] + sc[1] + translate[1],
    ]
    # Cut-out destination from the centre of the base image
    dst = base[bsc[0] - sc[0] : bsc[0] + sc[0], bsc[1] - sc[1] : bsc[1] + sc[1]]

    src_y, src_x = src.shape

    def get_matrix_offset(centre, angle, scale):
        """Follows OpenCV.getRotationMatrix2D"""
        angle_rad = angle * np.pi / 180
        alpha = np.round(scale * np.cos(angle_rad), 8)
        beta = np.round(scale * np.sin(angle_rad), 8)
        return (
            np.array([[alpha, beta], [-beta, alpha]]),
            np.array(
                [
                    (1 - alpha) * centre[0] - beta * centre[1],
                    beta * centre[0] + (1 - alpha) * centre[1],
                ]
            ),
        )

    matrix, offset = get_matrix_offset(
        np.array([((src_y - 1) / 2) - translate[0], ((src_x - 1) / 2) - translate[1]]), angle, 1
    )

    offset += np.array(translate)

    # scipy's affine_transform maps output coordinates back to input coordinates,
    # so build the full 3x3 homogeneous forward matrix and invert it to obtain
    # the matrix/offset in the form affine_transform expects
    M = np.column_stack((matrix, offset))
    M = np.vstack((M, [0, 0, 1]))
    iM = np.linalg.inv(M)
    imatrix = iM[:2, :2]
    ioffset = iM[:2, 2]

    # Determine the outer bounds of the new image
    lin_pts = np.array([[0, src_y-1, src_y-1, 0], [0, 0, src_x-1, src_x-1]])
    transf_lin_pts = np.dot(matrix, lin_pts) + offset.reshape(2, 1) # - np.array(translate).reshape(2, 1) # both?

    # Find min and max bounds of the transformed image
    min_x = np.floor(np.min(transf_lin_pts[1])).astype(int)
    min_y = np.floor(np.min(transf_lin_pts[0])).astype(int)
    max_x = np.ceil(np.max(transf_lin_pts[1])).astype(int)
    max_y = np.ceil(np.max(transf_lin_pts[0])).astype(int)

    # Add translation to the transformation matrix to shift to positive values
    anchor_x, anchor_y = 0, 0
    if min_x < 0:
        anchor_x = -min_x
    if min_y < 0:
        anchor_y = -min_y

    dot_anchor = np.dot(imatrix, [anchor_y, anchor_x])
    shifted_offset = ioffset - dot_anchor

    # Create padded destination image
    dst_y, dst_x = dst.shape[:2]
    pad_widths = [anchor_y, max(max_y, dst_y) - dst_y, anchor_x, max(max_x, dst_x) - dst_x]
    dst_padded = np.pad(
        dst,
        ((pad_widths[0], pad_widths[1]), (pad_widths[2], pad_widths[3])),
        "constant",
        constant_values=-10,
    )

    dst_pad_y, dst_pad_x = dst_padded.shape
    # Create the aligned and padded source image
    source_aligned = affine_transform(
        src,
        imatrix,
        offset=shifted_offset,
        output_shape=(dst_pad_y, dst_pad_x),
        order=3,
        mode="constant",
        cval=-10,
    )
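    # To actually see the result when calling affine_test, the plotting block from
    # the question can be reused here; a minimal two-panel version:
    import matplotlib.pyplot as plt

    fig, axes = plt.subplots(1, 2, figsize=(10, 5), sharex=True, sharey=True)
    axes[0].imshow(source_aligned, cmap="viridis", vmin=-10, vmax=nblob)
    axes[0].set_title("Source aligned to Dest padded")
    axes[1].imshow(dst_padded, cmap="viridis", vmin=-10, vmax=nblob)
    axes[1].set_title("Dest padded")
    plt.show()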

E.g. running:

affine_test(angle=-25, translate=(10, -40))

will show:

[image: aligned source and padded destination for affine_test(angle=-25, translate=(10, -40))]

and zoomed in:

[image: zoomed-in view of the aligned result]

Apologies that the code is not nicely written as-is.

Note that, running this in the wild, I have noticed it cannot handle any change in the scale of the images. I am not certain this isn't something to do with how I calculate the transformation, so it is a caveat worth noting, and checking, if you are aligning images with different scales.

Jdog
  • You should consider awarding the bounty to another answer (because it can't go to the question asker, even if they answer), so that rep won't be wasted. – Mar 26 '22 at 01:12
  • @richardec No, the rep won't be "wasted". Rep spent on a bounty is to attract more attention to a post, and it has done exactly that. – Red Mar 27 '22 at 13:22
  • @Jdog How nice, it's not every day that a user would go out of their way to answer their own question for the benefit of future readers! By the way, can you fix the indentation of the first function? Thanks :) – Red Mar 27 '22 at 13:29

If you have two images that are similar (or the same) and you want to align them, you can do it using both the rotate and shift functions:

from scipy.ndimage import rotate, shift

First you need to find the angular difference between the two images, angle_to_rotate; having that, you apply a rotation to src:

angle_to_rotate = 25
rotated_src = rotate(src, angle_to_rotate, reshape=True, order=1, mode="constant")

With reshape=True you avoid losing information from your original src matrix: the result is padded so that the image can then be translated around the (0, 0) indexes. You can calculate this translation as roughly (x*cos(angle), y*sin(angle)), where x and y are the dimensions of the image, but it probably won't matter.
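As a rough sketch of how much rotate grows the array when reshape=True (the output is the bounding box of the rotated rectangle, rounded; the 40x20 array here is just for illustration):

import numpy as np
from scipy.ndimage import rotate

src = np.zeros((40, 20))  # placeholder array, only used to check shapes
h, w = src.shape
theta = np.deg2rad(25)
grown = (h * abs(np.cos(theta)) + w * abs(np.sin(theta)),
         h * abs(np.sin(theta)) + w * abs(np.cos(theta)))
print(rotate(src, 25, reshape=True).shape)  # close to the values in grown, rounded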

Now you will need to translate the image so that it lines up with the destination; to do that you can use the shift function:

rot_translated_src = shift(rotated_src, [distance_y, distance_x])

In this case there is no reshape (because otherwise you wouldn't have any real translation) so if the image was not previously padded some information will be lost.

But you can do some padding with

np.pad(src, number, mode='constant')
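For instance (the margin of 30 is an arbitrary choice for illustration; pick whatever comfortably covers the expected translation):

import numpy as np

src = np.zeros((40, 20))  # placeholder arrays standing in for your images
dst = np.ones((40, 20))
src_padded = np.pad(src, 30, mode='constant')  # 30 zero-valued pixels on every side
dst_padded = np.pad(dst, 30, mode='constant')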

To calculate distance_x and distance_y you will need to find a point that serves as a reference between rotated_src and the destination, then just calculate the distance along the x and y axes.
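A minimal sketch of one way to do that, assuming both images contain the same distinctive bright feature (here the coordinates of the maximum value are used as the shared landmark, which is a crude choice purely for illustration):

import numpy as np

def landmark_offset(dst, rotated_src):
    # use the brightest pixel in each image as a shared landmark and
    # return (distance_y, distance_x) between them
    ref_dst = np.unravel_index(np.argmax(dst), dst.shape)
    ref_src = np.unravel_index(np.argmax(rotated_src), rotated_src.shape)
    return ref_dst[0] - ref_src[0], ref_dst[1] - ref_src[1]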

Summary

  1. Add some padding to src and dst
  2. Find the angular distance between them.
  3. Rotate src with scipy.ndimage.rotate using reshape=True
  4. Find the horizontal and vertical distance distance_x, distance_y between the rotated image and dst
  5. Translate your 'rotated_src' with scipy.ndimage.shift

Code

from scipy.ndimage import rotate, shift
import matplotlib.pyplot as plt
import numpy as np

First we make the destination image:

# make and plot dest
dst = np.ones([40,20])
dst = np.pad(dst,10)
dst[17,[14,24]]=4
dst[27,14:25]=4
dst[26,[14,25]]=4
rotated_dst = rotate(dst, 20, order=1)

plt.imshow(dst) # plot it
plt.imshow(rotated_dst)
plt.show()

We make the Source image:

# make_src image and plot it
src = np.zeros([40,20])
src = np.pad(src,10)
src[0:20,0:20]=1
src[7,[4,14]]=4
src[17,4:15]=4
src[16,[4,15]]=4
plt.imshow(src)
plt.show()

Then we align the src to the destination:

rotated_src = rotate(src, 20, order=1)  # rotate by the angle found (20); reshape=True is the default
plt.imshow(rotated_src)
plt.show()
distance_y = 8 # find this distances from rotated_src and dst
distance_x = 12 # use any visual reference or even the corners
translated_src = shift(rotated_src, [distance_y,distance_x])
plt.imshow(translated_src)
plt.show()

PS: If you have problems finding the angle and the distances programmatically, please leave a comment providing a bit more insight into what can be used as a reference (for example the frame of the image, or some image features/data).

Ziur Olpa