
My goal is to transform an image in such a way that three source points are mapped to three target points in an empty array. I have already solved for the correct affine matrix, but I cannot apply the affine transformation to a color image.

More specifically, I am struggling with the correct use of the scipy.ndimage.interpolation.affine_transform method. As this question and its answers point out, the affine_transform method can be somewhat unintuitive (especially regarding offset calculation); however, user timday shows how to apply a rotation and a shearing to an image and position it in another array, while user geodata gives more background information.

My problem is to generalize the approach shown there (1) to color images and (2) to an arbitrary transformation that I calculated myself.
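For reference, scipy documents affine_transform as a pull mapping: the output pixel at index vector o is sampled from the input image at position np.dot(matrix, o) + offset, so for a forward transform T one passes matrix = inv(T). A minimal single-channel sketch of that convention (the transform and shapes here are illustrative assumptions, not taken from the code below):

import numpy as np
from scipy import ndimage

# Forward transform T maps input coordinates to output coordinates;
# affine_transform wants the inverse mapping: in = matrix @ out + offset
T = np.array([[1.2, 0.1],
              [-0.1, 0.9]])
matrix = np.linalg.inv(T)

c_in = np.array([50., 50.])  # reference point in the input image
c_out = T.dot(c_in)          # where it should land in the output image
offset = c_in - matrix.dot(c_out)

img = np.zeros((100, 100))
img[45:55, 45:55] = 1.
warped = ndimage.affine_transform(img, matrix, offset=offset, order=1)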

This is my code (which should run as is on your computer):

import numpy as np
from scipy import ndimage
import matplotlib.pyplot as plt


def calcAffineMatrix(sourcePoints, targetPoints):
    # For three source- and three target points, find the affine transformation
    # Function works correctly, not part of the question
    A = []
    b = []
    for sp, trg in zip(sourcePoints, targetPoints):
        A.append([sp[0], 0, sp[1], 0, 1, 0])
        A.append([0, sp[0], 0, sp[1], 0, 1])
        b.append(trg[0])
        b.append(trg[1])
    result, resids, rank, s = np.linalg.lstsq(np.array(A), np.array(b), rcond=None)

    a0, a1, a2, a3, a4, a5 = result
    # Ignoring offset here, later use timday's suggested offset calculation
    affineTrafo = np.array([[a0, a1, 0], [a2, a3, 0], [0, 0, 1]], 'd')

    # Testing the correctness of transformation matrix
    for src, trg in zip(sourcePoints, targetPoints):
        # Work on copies so the caller's point lists are not mutated
        src = list(src) + [1.]
        trg = list(trg) + [1.]
        at = affineTrafo.copy()
        at[2, 0:2] = [a4, a5]
        assert np.array_equal(np.round(np.array(src).dot(at)), np.array(trg))
    return affineTrafo


# Prepare source image
sourcePoints = [[162., 112.], [130., 112.], [162., 240.]]
targetPoints = [[180., 102.], [101., 101.], [190., 200.]]
image = np.empty((300, 300, 3), dtype='uint8')
image[:] = 255
# Mark border for better visibility
image[0:2, :] = 0
image[-3:-1, :] = 0
image[:, 0:2] = 0
image[:, -3:-1] = 0
# Mark source points in red
for sp in sourcePoints:
    sp = [int(u) for u in sp]
    image[sp[1] - 5:sp[1] + 5, sp[0] - 5:sp[0] + 5, :] = np.array([255, 0, 0])

# Show image
plt.subplot(3, 1, 1)
plt.imshow(image)

# Prepare array in which the image is placed
array = np.empty((400, 300, 3), dtype='uint8')
array[:] = 255
a2 = array.copy()
# Mark target points in blue
for tp in targetPoints:
    tp = [int(u) for u in tp]
    a2[tp[1] - 2:tp[1] + 2, tp[0] - 2:tp[0] + 2] = [0, 0, 255]

# Show array
plt.subplot(3, 1, 2)
plt.imshow(a2)

# Next 5 program lines are actually relevant for question:

# Calculate affine matrix
affineTrafo = calcAffineMatrix(sourcePoints, targetPoints)

# This follows the c_in/c_out method proposed in the linked Stack Overflow question,
# extended by a third coordinate for the color channel (no translation here)
c_in = np.array([sourcePoints[0][0], sourcePoints[0][1], 0])
c_out = np.array([targetPoints[0][0], targetPoints[0][1], 0])
offset = (c_in - np.dot(c_out, affineTrafo))

# Affine transform!
ndimage.interpolation.affine_transform(image, affineTrafo, order=2, offset=offset,
                                       output=array, output_shape=array.shape,
                                       cval=255)
# Mark blue target points in array, expected to be above red source points
for tp in targetPoints:
    tp = [int(u) for u in tp]
    array[tp[1] - 2:tp[1] + 2, tp[0] - 2:tp[0] + 2] = [0, 0, 255]

plt.subplot(3, 1, 3)
plt.imshow(array)

plt.show()

Other approaches I tried include working with the inverse, the transpose, or both of affineTrafo:

affineTrafo = np.linalg.inv(affineTrafo)
affineTrafo = affineTrafo.T
affineTrafo = np.linalg.inv(affineTrafo.T)
affineTrafo = np.linalg.inv(affineTrafo).T

In his answer, geodata shows how to calculate the matrix that affine_transform needs to do a scaling and rotation:

If one wants a scaling S first and then a rotation R it holds that T=R*S and therefore T.inv=S.inv*R.inv (note the reversed order).
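A quick numerical check of that identity (the rotation angle and scale factors are illustrative, not from the linked answer):

import numpy as np

theta = np.deg2rad(30.)
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])  # rotation
S = np.diag([2., 0.5])                           # scaling
T = R.dot(S)                                     # scale first, then rotate
# Inverses compose in reversed order: T.inv = S.inv * R.inv
assert np.allclose(np.linalg.inv(T),
                   np.linalg.inv(S).dot(np.linalg.inv(R)))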

I tried to mimic this using matrix decomposition, using an SVD to decompose my affine transformation into a rotation, a scaling and another rotation:

u, s, v = np.linalg.svd(affineTrafo[:2, :2])
# u and v are orthogonal, so their inverses equal their transposes
uInv = np.linalg.inv(u)
sInv = np.linalg.inv(np.diag(s))
vInv = np.linalg.inv(v)
affineTrafo[:2, :2] = uInv.dot(sInv).dot(vInv)

Again, without success.

In all of my results, it is not (only) an offset problem: the pictures clearly show that the relative positions of the source and target points do not correspond.

I searched the web and Stack Overflow and did not find an answer to my problem. Please help me! :)

Manu CJ
  • My answer [here](https://stackoverflow.com/questions/44457064/displaying-stitched-images-together-without-cutoff-using-warpaffine/44459869#44459869) is related and might help you understand what this `offset` is and how to calculate it. – alkasm Jun 21 '17 at 12:54
  • @AlexanderReynolds Thank you, I've read your answer, but the problem is _earlier_ than the offset. Did you try to run the code? You will see the transformation goes totally wrong, not only the offset. Blue and red points should overlay but don't even have the right relative positioning. – Manu CJ Jun 21 '17 at 14:08
  • Yes, but I have no idea what's going on. The documentation is sorely lacking. It's not clear whether the locations are being computed with a pre- or post-multiplication (who knows whether to use your transform or the inverse), when the offset is applied, or what relation the warped points have to the coordinates of the destination image. I can tell you that you're computing `c_in` and `c_out` wrong, you will not get proper pixel locations with a `0` at the end (they should be homogeneous points like my answer says, which are undefined with a `0` at the end). Not the main issue though. – alkasm Jun 21 '17 at 14:14
  • Even though this issue *is* solvable with `scipy` I would *highly* suggest using `OpenCV` for tasks like this. This is two lines of code in OpenCV; using your variable names from this program: `affineTrafo = cv2.getAffineTransform(src_pts, trg_pts); array = cv2.warpAffine(image, affineTrafo, array.shape)`. – alkasm Jun 21 '17 at 14:18
  • Yeah, sorry mate---I've gone through this for like an hour and I have no idea how any of this is working. And I've implemented these routines completely from scratch. I would just do the warping and interpolation manually. – alkasm Jun 21 '17 at 14:26
  • @AlexanderReynolds Wow, thanks for your effort! `affine_transform` is really a mystery, isn't it? Okay, I'll try `OpenCV` and see if it works better! – Manu CJ Jun 21 '17 at 14:44
  • @AlexanderReynolds Concerning the offset calculation: I closely followed geodata's suggestion in the linked post (which works in his case), but had to extend it for three channels. Maybe this is where I got it wrong. – Manu CJ Jun 21 '17 at 14:46
  • It's not because you have three channels. You're warping one point from `src` and calculating the distance between it and the corresponding point from `trg` and using that to calculate the offset. So you need to do the proper computation, where you use homogeneous coordinates and divide by the last pixel. However, I cannot figure out *why* offset requires a third value (it's not like you offset into the third dimension, what??) You also shouldn't be using an offset at all. That's only for if you want to *move* the warp (like if you padded the destination and needed to move the warp there). – alkasm Jun 21 '17 at 14:53
  • Also, you shouldn't rely on that calculation *instead* of using your actual calculated affine transformation. The one using your calculated transformation takes into account all points, the other one only takes into account a single point (which was already used with the other points). It doesn't make any sense to do that. – alkasm Jun 21 '17 at 15:00
  • @AlexanderReynolds Stackoverflow is already warning me about the length of these comments. So last one from my side ;). I thought we need a third offset value because affine_transform is not aware that this is an image, and the third dimension represents channels which shall not be moved at all. And very valid point in the latest comment, I really should not calculate offset from only one point (which becomes more relevant when working with more than three points and approximate solutions)! Btw, `OpenCV` made it work almost immediately, thanks for the hint. Will post the code later. – Manu CJ Jun 21 '17 at 15:19
  • Oh, right---I suppose an affine transformation could be used to move a 3D object and thus could be offset in that third dimension. And I'm glad you got up and running with OpenCV so fast, the install *can* be a pain sometimes. – alkasm Jun 21 '17 at 16:10
  • @AlexanderReynolds Thanks again for your help! I posted the working code below. – Manu CJ Jun 21 '17 at 18:03

1 Answer


I finally got it working, thanks to AlexanderReynolds' hint to use another library. This is of course a workaround; I could not get it working with scipy's affine_transform, so I used OpenCV's cv2.warpAffine instead. In case this is helpful to anyone else, this is my code:

import numpy as np
import matplotlib.pyplot as plt
import cv2

# Prepare source image
sourcePoints = [[162., 112.], [130., 112.], [162., 240.]]
targetPoints = [[180., 102.], [101., 101.], [190., 200.]]
image = np.empty((300, 300, 3), dtype='uint8')
image[:] = 255
# Mark border for better visibility
image[0:2, :] = 0
image[-3:-1, :] = 0
image[:, 0:2] = 0
image[:, -3:-1] = 0
# Mark source points in red
for sp in sourcePoints:
    sp = [int(u) for u in sp]
    image[sp[1] - 5:sp[1] + 5, sp[0] - 5:sp[0] + 5, :] = np.array([255, 0, 0])

# Show image
plt.subplot(3, 1, 1)
plt.imshow(image)

# Prepare array in which the image is placed
array = np.empty((400, 300, 3), dtype='uint8')
array[:] = 255
a2 = array.copy()
# Mark target points in blue
for tp in targetPoints:
    tp = [int(u) for u in tp]
    a2[tp[1] - 2:tp[1] + 2, tp[0] - 2:tp[0] + 2] = [0, 0, 255]

# Show array
plt.subplot(3, 1, 2)
plt.imshow(a2)

# Calculate affine matrix and transform image
M = cv2.getAffineTransform(np.float32(sourcePoints), np.float32(targetPoints))
# warpAffine expects dsize as (width, height), i.e. the shape tuple reversed
array = cv2.warpAffine(image, M, (array.shape[1], array.shape[0]),
                       borderValue=[255, 255, 255])

# Mark blue target points in array, expected to be above red source points
for tp in targetPoints:
    tp = [int(u) for u in tp]
    array[tp[1] - 2:tp[1] + 2, tp[0] - 2:tp[0] + 2] = [0, 0, 255]

plt.subplot(3, 1, 3)
plt.imshow(array)

plt.show()

Comments:

  • Interesting how it worked almost immediately after changing the library. After spending more than a day trying to get it to work with scipy, this is a lesson to myself to switch libraries faster.
  • In case someone wants to find a (least squares) approximation of an affine transformation based on more than three points, this is how you get the matrix that works with cv2.warpAffine (see the usage example after the code):

Code:

def calcAffineMatrix(sourcePoints, targetPoints):
    # For three or more source and target points, find the affine transformation
    A = []
    b = []
    for sp, trg in zip(sourcePoints, targetPoints):
        A.append([sp[0], 0, sp[1], 0, 1, 0])
        A.append([0, sp[0], 0, sp[1], 0, 1])
        b.append(trg[0])
        b.append(trg[1])
    result, resids, rank, s = np.linalg.lstsq(np.array(A), np.array(b), rcond=None)

    a0, a1, a2, a3, a4, a5 = result
    affineTrafo = np.float32([[a0, a2, a4], [a1, a3, a5]])
    return affineTrafo
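For example, with four point correspondences (the coordinates below are made up purely to illustrate the call), the returned 2x3 matrix can be passed straight to cv2.warpAffine:

srcPts = [[10., 10.], [200., 20.], [20., 200.], [210., 190.]]
trgPts = [[15., 12.], [190., 25.], [25., 190.], [200., 180.]]
M = calcAffineMatrix(srcPts, trgPts)
warped = cv2.warpAffine(image, M, (array.shape[1], array.shape[0]),
                        borderValue=[255, 255, 255])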
Manu CJ
  • Just to give two quick notes in case you weren't aware: in OpenCV, image color channels are in BGR, as opposed to the normal RGB---not an issue for image warping or anything like that, but it might trip you up at some point (e.g. if you read an image with OpenCV but display with `Matplotlib` you need to convert from BGR to RGB). Also you can find full `(3,3)` homographies in OpenCV with four points using `cv2.getPerspectiveTransform()` or with `cv2.findHomography()` if you have four points or more (finds the best homography between all possible ones). – alkasm Jun 21 '17 at 20:28
  • Thanks, very helpful! :) Actually thinking about implementing the Moving Least Squares method of Schaefer et al. 2006, which should give me a more realistic image transform. – Manu CJ Jun 22 '17 at 09:13
  • That would work well, but it would definitely be slow compared to the methods built into OpenCV. The usual approach nowadays is generate feature matches with SIFT, ORB, etc and then toss them into `findHomography` which uses RANSAC to find the best possible homographies from all the possible feature matches. But either way, it's fun to apply the methods straight from a paper to code. – alkasm Jun 22 '17 at 09:18
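For readers who want to follow that suggestion, here is a minimal sketch of the pipeline alkasm describes (ORB features, brute-force matching, then findHomography with RANSAC); the image paths are placeholders:

import cv2
import numpy as np

# Note: cv2.imread returns BGR; convert with cv2.COLOR_BGR2RGB before plt.imshow
img1 = cv2.imread('source.png', cv2.IMREAD_GRAYSCALE)  # placeholder path
img2 = cv2.imread('target.png', cv2.IMREAD_GRAYSCALE)  # placeholder path

# Detect ORB keypoints and descriptors in both images
orb = cv2.ORB_create()
kp1, des1 = orb.detectAndCompute(img1, None)
kp2, des2 = orb.detectAndCompute(img2, None)

# Brute-force matching with Hamming distance (appropriate for binary ORB descriptors)
bf = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = bf.match(des1, des2)

# RANSAC keeps the homography consistent with the largest set of matches
src = np.float32([kp1[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
dst = np.float32([kp2[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)
H, mask = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)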