This is a basic transform question in PIL. I've tried at least a couple of times in the past few years to implement this correctly and it seems there is something I don't quite get about Image.transform in PIL. I want to implement a similarity transformation (or an affine transformation) where I can clearly state the limits of the image. To make sure my approach works I implemented it in Matlab.

The Matlab implementation is the following:

im = imread('test.jpg');
y = size(im,1);
x = size(im,2);
angle = 45*pi/180.0;
xextremes = [rot_x(angle,0,0),rot_x(angle,0,y-1),rot_x(angle,x-1,0),rot_x(angle,x-1,y-1)];
yextremes = [rot_y(angle,0,0),rot_y(angle,0,y-1),rot_y(angle,x-1,0),rot_y(angle,x-1,y-1)];
m = [cos(angle) sin(angle) -min(xextremes); -sin(angle) cos(angle) -min(yextremes); 0 0 1];
tform = maketform('affine',m')
round( [max(xextremes)-min(xextremes), max(yextremes)-min(yextremes)])
im = imtransform(im,tform,'bilinear','Size',round([max(xextremes)-min(xextremes), max(yextremes)-min(yextremes)]));
imwrite(im,'output.jpg');

function y = rot_x(angle,ptx,pty),
    y = cos(angle)*ptx + sin(angle)*pty

function y = rot_y(angle,ptx,pty),
    y = -sin(angle)*ptx + cos(angle)*pty

this works as expected. This is the input:

enter image description here

and this is the output:

enter image description here

This is the Python/PIL code that implements the same transformation:

from PIL import Image
import math

def rot_x(angle,ptx,pty):
    return math.cos(angle)*ptx + math.sin(angle)*pty

def rot_y(angle,ptx,pty):
    return -math.sin(angle)*ptx + math.cos(angle)*pty

angle = math.radians(45)
im = Image.open('test.jpg')
(x,y) = im.size
xextremes = [rot_x(angle,0,0),rot_x(angle,0,y-1),rot_x(angle,x-1,0),rot_x(angle,x-1,y-1)]
yextremes = [rot_y(angle,0,0),rot_y(angle,0,y-1),rot_y(angle,x-1,0),rot_y(angle,x-1,y-1)]
mnx = min(xextremes)
mxx = max(xextremes)
mny = min(yextremes)
mxy = max(yextremes)
im = im.transform((int(round(mxx-mnx)),int(round((mxy-mny)))),Image.AFFINE,(math.cos(angle),math.sin(angle),-mnx,-math.sin(angle),math.cos(angle),-mny),resample=Image.BILINEAR)
im.save('outputpython.jpg')

and this is the output from Python:

enter image description here

I've tried this with several versions of Python and PIL on multiple OSs through the years and the result is always mostly the same.

This is the simplest possible case that illustrates the problem. I understand that if all I wanted was a rotation, I could use the im.rotate call, but I want to shear and scale too; this is just an example to illustrate the problem. I would like to get the same output for all affine transformations, and I would like to be able to get this right.

EDIT:

If I change the transform line to this:

im = im.transform((int(round(mxx-mnx)),int(round((mxy-mny)))),Image.AFFINE,(math.cos(angle),math.sin(angle),0,-math.sin(angle),math.cos(angle),0),resample=Image.BILINEAR)

this is the output I get:

enter image description here

EDIT #2

I rotated by -45 degrees and changed the offset to -0.5*mnx and -0.5*mny and obtained this:

enter image description here

Erlend Graff
carlosdc
  • Is it possible that the (0,0) spatial location of an image is defined differently for python and matlab? For matlab (0,0) is the upper left corner of the image. Could it be that for python it is the center of the image? What would happen if you omit the translation part of the transformation in python (i.e., without `-mnx` and `-mny`)? – user2469775 Jun 12 '13 at 05:33
  • @user2469775: I've tried what you suggested and got a new output, I've edited the question. – carlosdc Jun 12 '13 at 17:04
  • so it seems like (0,0) is in the middle of the image. Can you please try: `Image.AFFINE(math.cos(angle),math.sin(angle),-.5*mnx,-math.sin(angle),math.cos(angle),-.5*mny)`? – Shai Jun 12 '13 at 17:54
  • also, you might need to work with `-angle` instead of `angle`. – Shai Jun 12 '13 at 17:55
  • @Shai: I tried what you suggest and edited the question with the results I got. – carlosdc Jun 12 '13 at 18:39
  • I guess my guesses are as good as yours. I believe at this point trial and error will give you the proper result. Once you get there, I believe it would be easier to "reverse-engineer" the matrix to understand the behavior of PIL – Shai Jun 12 '13 at 19:01
  • @Shai: Thanks! This comes up every now and then in my work. I'm always able to work around it, but never with a principled solution I would want (or like the one I get in Matlab). – carlosdc Jun 12 '13 at 19:09

4 Answers

OK! So I've been working on understanding this all weekend and I think I have an answer that satisfies me. Thank you all for your comments and suggestions!

I started by looking at this:

affine transform in PIL python?

While I see that the author can make arbitrary similarity transformations, it does not explain why my code was not working, nor does it explain the spatial layout of the image that we need to transform, nor does it provide a linear-algebraic solution to my problem.

But I do see from his code that he's dividing the rotation part of the matrix (a, b, d and e) by the scale, which struck me as odd. I went back to read the PIL documentation, which I quote:

"im.transform(size, AFFINE, data, filter) => image

Applies an affine transform to the image, and places the result in a new image with the given size.

Data is a 6-tuple (a, b, c, d, e, f) which contain the first two rows from an affine transform matrix. For each pixel (x, y) in the output image, the new value is taken from a position (a x + b y + c, d x + e y + f) in the input image, rounded to nearest pixel.

This function can be used to scale, translate, rotate, and shear the original image."

So the parameters (a, b, c, d, e, f) are a transform matrix, but one that maps (x, y) in the destination image to (a x + b y + c, d x + e y + f) in the source image. They are not the parameters of the transform you want to apply, but of its inverse. That is:

  • weird
  • different than in Matlab
  • but now, fortunately, fully understood by me
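
To make the direction of the mapping concrete, here is a minimal pure-Python sketch of what the quoted documentation describes. It is a toy on a list-of-rows "image", not PIL's actual implementation, and the helper name `affine_nearest` is made up for this illustration:

```python
def affine_nearest(src, out_size, coeffs, fill=0):
    """Toy version of the documented AFFINE semantics (hypothetical helper,
    NOT PIL's real code): every OUTPUT pixel looks up an INPUT position."""
    a, b, c, d, e, f = coeffs
    out_w, out_h = out_size
    src_h, src_w = len(src), len(src[0])
    out = []
    for y in range(out_h):
        row = []
        for x in range(out_w):
            # Output pixel (x, y) is read from this position in the input.
            sx = int(round(a * x + b * y + c))
            sy = int(round(d * x + e * y + f))
            inside = 0 <= sx < src_w and 0 <= sy < src_h
            row.append(src[sy][sx] if inside else fill)
        out.append(row)
    return out


src = [[1, 2, 3],
       [4, 5, 6]]

# c = -1 means "output (x, y) comes from input (x - 1, y)", so the content
# ends up one pixel to the RIGHT: the opposite of what c = -1 suggests.
shifted = affine_nearest(src, (3, 2), (1, 0, -1, 0, 1, 0))
print(shifted)  # [[0, 1, 2], [0, 4, 5]]
```

A translation coefficient of -1 moving the content by +1 is the smallest example of the "inverse" behaviour described above.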

I'm attaching my code:

from PIL import Image
import math
import numpy as np

def rot_x(angle,ptx,pty):
    return math.cos(angle)*ptx + math.sin(angle)*pty

def rot_y(angle,ptx,pty):
    return -math.sin(angle)*ptx + math.cos(angle)*pty

angle = math.radians(45)
im = Image.open('test.jpg')
(x,y) = im.size
# rotate the four corners to find the bounding box of the rotated image
xextremes = [rot_x(angle,0,0),rot_x(angle,0,y-1),rot_x(angle,x-1,0),rot_x(angle,x-1,y-1)]
yextremes = [rot_y(angle,0,0),rot_y(angle,0,y-1),rot_y(angle,x-1,0),rot_y(angle,x-1,y-1)]
mnx = min(xextremes)
mxx = max(xextremes)
mny = min(yextremes)
mxy = max(yextremes)
print(mnx, mny)
# T is the forward transform, i.e. the one I actually want to apply
T = np.array([[math.cos(angle),math.sin(angle),-mnx],[-math.sin(angle),math.cos(angle),-mny],[0,0,1]])
# PIL wants the mapping from output pixels back to input pixels: the inverse of T
Tinv = np.linalg.inv(T)
print(Tinv)
Tinvtuple = (Tinv[0,0],Tinv[0,1], Tinv[0,2], Tinv[1,0],Tinv[1,1],Tinv[1,2])
print(Tinvtuple)
im = im.transform((int(round(mxx-mnx)),int(round((mxy-mny)))),Image.AFFINE,Tinvtuple,resample=Image.BILINEAR)
im.save('outputpython2.jpg')

and the output from python:

enter image description here

Let me state the answer to this question again in a final summary:

PIL requires the inverse of the affine transformation you want to apply.
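
If you would rather not pull in numpy just for a 3x3 inverse, the 6-tuple can also be inverted in closed form. The following is only a sketch; `invert_affine` and `apply_affine` are names I made up for illustration, they are not part of PIL:

```python
import math


def invert_affine(coeffs):
    """Invert a 2x3 affine given as (a, b, c, d, e, f) (hypothetical helper)."""
    a, b, c, d, e, f = coeffs
    det = a * e - b * d
    if det == 0:
        raise ValueError("transform is not invertible")
    # Inverse of the 2x2 linear part [[a, b], [d, e]].
    ia, ib = e / det, -b / det
    id_, ie = -d / det, a / det
    # The inverse translation is minus the inverse linear part applied to (c, f).
    ic = -(ia * c + ib * f)
    if_ = -(id_ * c + ie * f)
    return (ia, ib, ic, id_, ie, if_)


def apply_affine(coeffs, x, y):
    """Map the point (x, y) through (a, b, c, d, e, f)."""
    a, b, c, d, e, f = coeffs
    return (a * x + b * y + c, d * x + e * y + f)


angle = math.radians(45)
# The forward transform: the one you actually want (offsets are arbitrary here).
forward = (math.cos(angle), math.sin(angle), 30.0,
           -math.sin(angle), math.cos(angle), 70.0)
inverse = invert_affine(forward)

# Pushing a point forward and then through the inverse returns the original.
x, y = apply_affine(forward, 12.0, 34.0)
back = apply_affine(inverse, x, y)
print(back)
```

The tuple returned by `invert_affine(forward)` is what would go into `im.transform(size, Image.AFFINE, ...)` in place of `Tinvtuple`.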

carlosdc

I wanted to expand a bit on the answers by carlosdc and Ruediger Jungbeck, to present a more practical python code solution with a bit of explanation.

First, it is absolutely true that PIL uses inverse affine transformations, as stated in carlosdc's answer. However, there is no need to use linear algebra to compute the inverse transformation from the original transformation—instead, it can easily be expressed directly. I'll use scaling and rotating an image about its center for the example, as in the code linked to in Ruediger Jungbeck's answer, but it's fairly straightforward to extend this to do e.g. shearing as well.

Before approaching how to express the inverse affine transformation for scaling and rotating, consider how we'd find the original transformation. As hinted at in Ruediger Jungbeck's answer, the transformation for the combined operation of scaling and rotating is found as the composition of the fundamental operators for scaling an image about the origin and rotating an image about the origin.

However, since we want to scale and rotate the image about its own center, and the origin (0, 0) is defined by PIL to be the upper left corner of the image, we first need to translate the image such that its center coincides with the origin. After applying the scaling and rotation, we also need to translate the image back in such a way that the new center of the image (it might not be the same as the old center after scaling and rotating) ends up in the center of the image canvas.

So the original "standard" affine transformation we're after will be the composition of the following fundamental operators:

  1. Find the current center (c_x, c_y) of the image, and translate the image by (-c_x, -c_y), so the center of the image is at the origin (0, 0).

  2. Scale the image about the origin by some scale factor (s_x, s_y).

  3. Rotate the image about the origin by some angle \theta.

  4. Find the new center (t_x, t_y) of the image, and translate the image by (t_x, t_y) so the new center will end up in the center of the image canvas.

To find the transformation we're after, we first need to know the transformation matrices of the fundamental operators, which are as follows:

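  • Translation by (x, y):
    \[ T(x, y) = \begin{pmatrix} 1 & 0 & x \\ 0 & 1 & y \\ 0 & 0 & 1 \end{pmatrix} \]
  • Scaling by (s_x, s_y):
    \[ S(s_x, s_y) = \begin{pmatrix} s_x & 0 & 0 \\ 0 & s_y & 0 \\ 0 & 0 & 1 \end{pmatrix} \]
  • Rotation by \theta:
    \[ R(\theta) = \begin{pmatrix} \cos\theta & -\sin\theta & 0 \\ \sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{pmatrix} \]

Then, our composite transformation can be expressed as:

\[ T(t_x, t_y) \, R(\theta) \, S(s_x, s_y) \, T(-c_x, -c_y) \]

which is equal to

\[ \begin{pmatrix} s_x \cos\theta & -s_y \sin\theta & t_x - s_x c_x \cos\theta + s_y c_y \sin\theta \\ s_x \sin\theta & s_y \cos\theta & t_y - s_x c_x \sin\theta - s_y c_y \cos\theta \\ 0 & 0 & 1 \end{pmatrix} \]

or

\[ \begin{pmatrix} a & b & c \\ d & e & f \\ 0 & 0 & 1 \end{pmatrix} \]

where

\[ a = s_x \cos\theta, \quad b = -s_y \sin\theta, \quad c = t_x - c_x a - c_y b, \quad d = s_x \sin\theta, \quad e = s_y \cos\theta, \quad f = t_y - c_x d - c_y e. \]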
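
The composition of the four steps can be sketched in plain Python with 3x3 lists. This is just an illustration under the usual convention that the rightmost matrix is applied first; all function names here are my own:

```python
import math


def matmul(A, B):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def translation(x, y):
    return [[1.0, 0.0, x], [0.0, 1.0, y], [0.0, 0.0, 1.0]]


def scaling(sx, sy):
    return [[sx, 0.0, 0.0], [0.0, sy, 0.0], [0.0, 0.0, 1.0]]


def rotation(theta):
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def forward_transform(cx, cy, sx, sy, theta, tx, ty):
    # Applied right to left: move center to origin, scale, rotate, re-center.
    m = matmul(scaling(sx, sy), translation(-cx, -cy))
    m = matmul(rotation(theta), m)
    return matmul(translation(tx, ty), m)


def apply(m, x, y):
    """Apply the 3x3 affine matrix m to the point (x, y)."""
    return (m[0][0] * x + m[0][1] * y + m[0][2],
            m[1][0] * x + m[1][1] * y + m[1][2])


theta = math.radians(10)
m = forward_transform(50.0, 40.0, 0.8, 1.2, theta, 60.0, 55.0)

# The old center (c_x, c_y) must land exactly on the new center (t_x, t_y).
center = apply(m, 50.0, 40.0)
print(center)
```

Checking that the old center maps to the new one is a cheap sanity test that the factors were multiplied in the right order.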

Now, to find the inverse of this composite affine transformation, we just need to calculate the composition of the inverse of each fundamental operator in reverse order. That is, we want to

  1. Translate the image by (-t_x, -t_y)

  2. Rotate the image about the origin by -\theta.

  3. Scale the image about the origin by (1/s_x, 1/s_y).

  4. Translate the image by (c_x, c_y).

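This results in a transformation matrix

\[ \begin{pmatrix} a & b & c \\ d & e & f \\ 0 & 0 & 1 \end{pmatrix} \]

where

\[ a = \frac{\cos\theta}{s_x}, \quad b = \frac{\sin\theta}{s_x}, \quad c = c_x - t_x a - t_y b, \quad d = -\frac{\sin\theta}{s_y}, \quad e = \frac{\cos\theta}{s_y}, \quad f = c_y - t_x d - t_y e. \]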
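
As a quick sanity check of the inverse, the sketch below pushes a point through the four forward steps and then through the coefficients (a, b, c, d, e, f) used in the code further down, assuming the standard rotation convention (x, y) -> (x cos θ - y sin θ, x sin θ + y cos θ). The function names and the sample numbers are mine:

```python
import math


def inverse_coeffs(cx, cy, sx, sy, theta, tx, ty):
    """Closed-form coefficients of the inverse transform, in PIL's tuple order."""
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    a = cos_t / sx
    b = sin_t / sx
    c = cx - tx * a - ty * b
    d = -sin_t / sy
    e = cos_t / sy
    f = cy - tx * d - ty * e
    return (a, b, c, d, e, f)


def forward_point(x, y, cx, cy, sx, sy, theta, tx, ty):
    """Apply the four forward steps to one point, one step at a time."""
    x, y = x - cx, y - cy                    # 1. move the center to the origin
    x, y = x * sx, y * sy                    # 2. scale about the origin
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    x, y = cos_t * x - sin_t * y, sin_t * x + cos_t * y  # 3. rotate by theta
    return x + tx, y + ty                    # 4. move to the new center


params = (50.0, 40.0, 0.8, 1.2, math.radians(10), 60.0, 55.0)
a, b, c, d, e, f = inverse_coeffs(*params)

# Forward, then inverse, should give back the starting point (7, 9).
u, v = forward_point(7.0, 9.0, *params)
recovered = (a * u + b * v + c, d * u + e * v + f)
print(recovered)
```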

This is exactly the same as the transformation used in the code linked to in Ruediger Jungbeck's answer. It can be made more convenient by reusing the same technique that carlosdc used in their post for calculating (t_x, t_y): applying the rotation to all four corners of the image, and then calculating the distance between the minimum and maximum X and Y values. However, since the image is rotated about its own center, there's no need to rotate all four corners, since each pair of oppositely facing corners is rotated "symmetrically".
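
To illustrate the shortcut, here is a small sketch comparing the two ways of getting the size of the rotated bounding box. Both function names are made up for this example, and sizes are left as floats (the real code rounds up with `math.ceil`):

```python
import math


def rotated_size(w, h, theta):
    """Bounding-box size of a w x h rectangle rotated by theta, without
    touching the individual corners."""
    cos_t, sin_t = abs(math.cos(theta)), abs(math.sin(theta))
    return (cos_t * w + sin_t * h, sin_t * w + cos_t * h)


def rotated_size_from_corners(w, h, theta):
    """Same quantity, computed by rotating all four corners and taking the
    distance between the extreme X and Y values."""
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    corners = [(0, 0), (w, 0), (0, h), (w, h)]
    xs = [cos_t * x - sin_t * y for x, y in corners]
    ys = [sin_t * x + cos_t * y for x, y in corners]
    return (max(xs) - min(xs), max(ys) - min(ys))


direct = rotated_size(100, 50, math.radians(30))
via_corners = rotated_size_from_corners(100, 50, math.radians(30))
print(direct, via_corners)
```

Half of that size is then the new center (t_x, t_y). For the scaled case, pass the scaled width and height instead of w and h.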

Here is a rewritten version of carlosdc's code that has been modified to use the inverse affine transformation directly, and which also adds scaling:

from PIL import Image
import math


def scale_and_rotate_image(im, sx, sy, deg_ccw):
    im_orig = im
    im = Image.new('RGBA', im_orig.size, (255, 255, 255, 255))
    im.paste(im_orig)

    w, h = im.size
    angle = math.radians(-deg_ccw)

    cos_theta = math.cos(angle)
    sin_theta = math.sin(angle)

    scaled_w, scaled_h = w * sx, h * sy

    new_w = int(math.ceil(math.fabs(cos_theta * scaled_w) + math.fabs(sin_theta * scaled_h)))
    new_h = int(math.ceil(math.fabs(sin_theta * scaled_w) + math.fabs(cos_theta * scaled_h)))

    cx = w / 2.
    cy = h / 2.
    tx = new_w / 2.
    ty = new_h / 2.

    a = cos_theta / sx
    b = sin_theta / sx
    c = cx - tx * a - ty * b
    d = -sin_theta / sy
    e = cos_theta / sy
    f = cy - tx * d - ty * e

    return im.transform(
        (new_w, new_h),
        Image.AFFINE,
        (a, b, c, d, e, f),
        resample=Image.BILINEAR
    )


im = Image.open('test.jpg')
im = scale_and_rotate_image(im, 0.8, 1.2, 10)
im.save('outputpython.png')

and this is what the result looks like (scaled with (sx, sy) = (0.8, 1.2), and rotated 10 degrees counter-clockwise):

Scaled and rotated

Erlend Graff
  • @Colbi sorry, but your calculation is wrong ;) You cannot simply include the `-tx` and `-ty` terms into the matrix together with `a`, `b`, `d`, and `e` like that. Also, you are multiplying the matrices in the wrong order. See the [full](https://imgur.com/a/5z4D1wI) or [simplified](https://imgur.com/a/87L3ymH) calculation. – Erlend Graff Apr 11 '21 at 10:06

I think this should answer your question.

If not, you could consider that affine transformations can be concatenated into a single transformation.

So you could split your desired operation into:

  1. Moving the origin to the center of the image

  2. Rotating

  3. Moving the origin back

  4. Resizing

You could then compute a single transformation out of this.

Ruediger Jungbeck

The image is rotated around a center point. The origin (0, 0) of the PIL image coordinate system is the top left corner.

If you use a product of matrices to construct your affine transformation I suggest adding a temporary centering/decentering transform.

We construct the affine transformation from the following basic blocks

import numpy as np

def translation(x, y):
    mat = np.eye(3)
    mat[0, 2] = x
    mat[1, 2] = y
    return mat

def scaling(s):
    mat = np.eye(3)
    mat[0, 0] = s
    mat[1, 1] = s
    return mat

def rotation(degree):
    mat = np.eye(3)
    rad = np.deg2rad(degree)
    mat[0, 0] = np.cos(rad)
    mat[0, 1] = -np.sin(rad)
    mat[1, 0] = np.sin(rad)
    mat[1, 1] = np.cos(rad)
    return mat

def tmp_center(w, h):
    mat = np.eye(3)
    mat[0, 2] = -w/2
    mat[1, 2] = -h/2
    return mat

Then load an image and define the transformation. Unlike in some other libraries, make sure to pass the inverse, as others have pointed out.

from PIL import Image
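img = Image.fromarray(...)
w, h = img.size
# applied right to left: center, scale, rotate, un-center, then translate
T = translation(20, 23) @ tmp_center(-w, -h) @ rotation(5) @ scaling(0.69) @ tmp_center(w, h)
# PIL expects the inverse mapping (output -> input): first two rows of inv(T)
coeff = tuple(np.linalg.inv(T).flatten()[:6])

out = img.transform(img.size, Image.AFFINE, coeff, resample=Image.BILINEAR)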
0-_-0