
Motivation: suppose that I have an RGB image J and I want to apply a transformation T (like a rotation) to the pixels of J. I will create a new black image K whose pixels are related to J by K[x,y] = J[T[x,y]]. Now the problem is that T[x,y] has to lie inside J, and if I want to capture the transformed image of J completely, I might have to deal with negative values of x or y, or with values that are larger than the size of J. So, first I have to determine the size of K and then shift the pixels of K by an appropriate vector to avoid negative indices.

Now, suppose that I have determined the appropriate translation vector. I want to do a coordinate translation that sends (x,y) to (x+a, y+b).
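
For concreteness, here is a rough sketch of how I find these values (assuming T is an invertible 2x2 matrix acting on pixel coordinates; find_extent is just an illustrative name):

import numpy as np

# illustrative helper: transform the four corners of an n-by-m image
# forward through T and read off the extents of the result
def find_extent(T, n, m):
    corners = np.array([[0, 0, n - 1, n - 1],
                        [0, m - 1, 0, m - 1]])
    ti, tj = T @ corners
    return ti.min(), ti.max(), tj.min(), tj.max()

# shifting by (a, b) = (-i_min, -j_min) then makes every index of K non-negative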

Goal: Using for loops, what I want to do is the following:

for i in range(0,J.shape[0]):
    for j in range(0, J.shape[1]):
        K[i+a,j+b] = J[T[i,j]]

How can I do this in a vectorized way? Any help is appreciated.


Edit:

import numpy as np
from scipy.misc import face # for dummy data

img = face() # dummy RGB data

i,j = np.mgrid[:img.shape[0], :img.shape[1]] # 2d arrays each
i_min, i_max, j_min, j_max = ... # assume these have been found (from the forward-transformed corners)
i = i + i_min
j = j + j_min
T = np.array([[1, -1],[1, 1]])/np.sqrt(2)
inew,jnew = np.linalg.inv(T) @ [i.ravel(), j.ravel()] # 1d arrays each

inew = np.floor(inew).astype(int)
jnew = np.floor(jnew).astype(int)

out = np.zeros((i_max - i_min, j_max - j_min, 3), dtype=img.dtype)

for i, j in zip(inew, jnew):
    out[i - i_min, j - j_min, :] = img[i, j, :]

Now I want to cancel the effect of shifting by i_min and j_min in the array out just like the code I wrote using for-loops.

  • Can you provide a short example of J, T and the desired output K? It would help. – Andreas K. Oct 31 '18 at 15:45
  • @AndyK Sure. A concrete scenario is like what I said: for example, take J to be a square (M,M,3) photo. Take T to be a rotation by 45 degrees. Then K should be a photo with all its entries equal to 0 but its size should change to contain the rotated image: its height and its width should be sqrt(2) times larger than the original image. The main issue is that in some transformations, some negative values (x,y) in K might be mapped to perfectly good tuples that lie inside our image J. I want to capture those pixels in my transformed image too. Does it make sense? – stressed out Oct 31 '18 at 16:00
  • I am not sure, but have a look here https://stackoverflow.com/questions/25458442/rotate-a-2d-image-around-specified-origin-in-python to see how to rotate an image – Andreas K. Oct 31 '18 at 16:10
  • @AndyK Thanks for the help but I'm not rotating an image. Actually, I'm doing a projective transformation. So, T in the question is actually a homography matrix. My earlier comment was just an example for you to see what I'm looking for as requested. – stressed out Oct 31 '18 at 16:13
  • I think I know how to vectorize your example double loop but I don't think it does what you want it to do. If I understand correctly, in your general case the input pixels from `[1:N, 1:M]` (so to speak) are transformed somewhere else on the "index plane", so we shift the result so that it contains only non-negative indices and assign to those new indices of an empty auxiliary array `K`. My first issue is a theoretical one: even in your 45-degree rotation example it's clear that pixels don't transform into pixels. The new grid doesn't align with an x/y grid. This is even more true in general. – Andras Deak -- Слава Україні Oct 31 '18 at 17:39
  • @AndrasDeak What do you mean by 'pixels don't transform into pixels'? Do you mean that their coordinates don't need to be integer? We can apply the floor function if that's what you mean. – stressed out Nov 01 '18 at 11:06
  • Regarding your update: it still isn't complete. I suspect you need `i_min, i_max, j_min, j_max = i.min(), i.max(), j.min(), j.max()`, but if I fix that I get an `IndexError`. Plus now you have vectorized code...what I'd suggest is doing the slow loopy version with the above dummy data to show what you want to vectorize. – Andras Deak -- Слава Україні Nov 01 '18 at 13:13
  • @AndrasDeak i_min, i_max, j_min, j_max are obtained by transforming the corners of the image in the forward direction. After that, we move in the backward direction. I don't understand what you mean by 'show what you want to vectorize'. :| – stressed out Nov 01 '18 at 13:16
  • Your original question was "how do I vectorize this double loop?", but the code was incomplete so it wasn't clear what the loop was actually doing. I'm suggesting that you show the complete loopy solution to the above example of 45-degree rotation, so that it's clear. _Or_, if you have a vectorized version, make it so that it runs without errors and clarify what's wrong with it still. – Andras Deak -- Слава Україні Nov 01 '18 at 13:18
  • @AndrasDeak I edited my question and added the loopy version to the code. Is it clear now? – stressed out Nov 01 '18 at 13:23
  • Still not complete. Aren't the extremal indices basically `i_min, i_max, j_min, j_max = inew.min(), inew.max(), jnew.min(), jnew.max()`? Is it hard to actually compute those indices to get a runnable example? Anyway, I think I can try vectorizing based on this. – Andras Deak -- Слава Україні Nov 01 '18 at 13:36
  • Hmm, are those minmax indices computed so that after the transformation you get exactly nonnegative values starting from 0? – Andras Deak -- Слава Україні Nov 01 '18 at 13:38
  • @AndrasDeak No to your first question. They're found based on a forward transformation, not a backward transformation. Consider them as constants given to us for now, if you may. "Yes, exactly!" to your second question. We first need to consider negative pixels to do the backward transformation correctly. After we're done with the transformation, we want only nonnegative indices. Is it clear now? – stressed out Nov 01 '18 at 13:44
  • @AndrasDeak Thank you for your patience with me. I appreciate it. :) – stressed out Nov 01 '18 at 13:46

2 Answers


Naive version

As I understand your question: you have an input image, you transform its pixel positions, and want to put the result into a larger array that can accommodate it. Here's how I'd do that:

import numpy as np
import matplotlib.pyplot as plt # for plotting the result
from scipy.misc import face # for dummy data
img = face() # dummy RGB data

# transform pixels by 45 degrees
i,j = np.mgrid[:img.shape[0], :img.shape[1]] # 2d arrays each
T = np.array([[1, -1],[1, 1]])/np.sqrt(2)
inew,jnew = T @ [i.ravel(), j.ravel()] # 1d arrays each

# new coordinates now range into negatives, shift back into positives
# and the non-integer pixel indices will be normalized with floor
inew = np.floor(inew - inew.min()).astype(int)
jnew = np.floor(jnew - jnew.min()).astype(int)

# now the new coordinates are all non-negative, this defines the size of the output
out = np.zeros((inew.max() + 1, jnew.max() + 1, 3), dtype=img.dtype)

# fill the necessary indices of out with pixels from img
# reshape the indices to 2d for matching broadcast
inew = inew.reshape(img.shape[:-1])
jnew = jnew.reshape(img.shape[:-1])
out[inew, jnew, :] = img
# OR, alternative with 1d index arrays:
#out[inew, jnew, :] = img.reshape(-1, 3)

# check what we've done
plt.imshow(out)
plt.show()

[figure: rotated raccoon]

The gist of the code is that the rotated pixel coordinates are shifted back into the positives (this corresponds to your [i+a, j+b] shift), a new zero array is allocated that will fit all the new indices, and fancy indexing is applied only on the left-hand side of the assignment! This is what doesn't match your code, but I believe this is what you really want to do: for each pixel in the original (unindexed) image we set its RGB value in the new position of the resulting array.
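
To make that left-hand-side indexing explicit, here is a toy sketch with made-up 2x2 data (not part of the solution above): each source pixel is scattered to its new position in a single vectorized assignment:

import numpy as np

src = np.arange(4).reshape(2, 2)  # toy 2x2 "image": [[0, 1], [2, 3]]
inew = np.array([[1, 1], [0, 0]]) # new row index for each source pixel
jnew = np.array([[0, 1], [0, 1]]) # new column index for each source pixel
out = np.zeros((2, 2), dtype=src.dtype)
out[inew, jnew] = src             # vectorized scatter: a vertical flip
# out is now [[2, 3], [0, 1]]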

As you can see, there are a lot of black pixels in the image due to the fact that the non-integer transformed coordinates were rounded with floor. This is not nice, so if we pursue this path we should perform 2d interpolation in order to get rid of these artifacts. Note that this needs quite a bit of memory and CPU time:

import numpy as np
import scipy.interpolate as interp
import matplotlib.pyplot as plt # for plotting the result
from scipy.misc import face # for dummy data
img = face() # dummy RGB data

# transform pixels by 45 degrees
i,j = np.mgrid[:img.shape[0], :img.shape[1]] # 2d arrays each
T = np.array([[1, -1],[1, 1]])/np.sqrt(2)
inew,jnew = T @ [i.ravel(), j.ravel()] # 1d arrays each

# new coordinates now range into negatives, shift back into positives
# keep them non-integer for interpolation later
inew -= inew.min()
jnew -= jnew.min()
# (inew, jnew, img) contain the data from which the output should be interpolated


# now the new coordinates are all non-negative, this defines the size of the output
out = np.zeros((int(round(inew.max())) + 1, int(round(jnew.max())) + 1, 3), dtype=img.dtype)
i_interp,j_interp = np.mgrid[:out.shape[0], :out.shape[1]]

# interpolate for each channel
for channel in range(3):
    out[..., channel] = interp.griddata(
        np.array([inew.ravel(), jnew.ravel()]).T, # points where data is known
        img[..., channel].ravel(),                # known data values
        (i_interp, j_interp),                     # points to interpolate at
        fill_value=0)

# check what we've done
plt.imshow(out)
plt.show()

At least the result looks much better:

[figure: interpolated version with griddata]

scipy.ndimage: map_coordinates

An approach more directly along the lines of what you had in mind can make use of scipy.ndimage.map_coordinates to perform interpolation using the inverse transformation. This should have better performance than the previous attempt with griddata, since map_coordinates can make use of the fact that the input data is defined on a grid. It turns out that it indeed uses both less memory and much less CPU:

import numpy as np
import scipy.ndimage as ndi
import matplotlib.pyplot as plt # for plotting the result
from scipy.misc import face # for dummy data

img = face() # dummy RGB data
n,m = img.shape[:-1]

# transform pixels by 45 degrees
T = np.array([[1, -1],[1, 1]])/np.sqrt(2)

# find out the extent of the transformed pixels from the four corners
inew_tmp,jnew_tmp = T @ [[0, 0, n-1, n-1], [0, m-1, 0, m-1]] # 1d arrays each
imin,imax,jmin,jmax = inew_tmp.min(),inew_tmp.max(),jnew_tmp.min(),jnew_tmp.max()
imin,imax,jmin,jmax = (int(round(val)) for val in (imin,imax,jmin,jmax))

# so the pixels of the original map inside [imin, imax] x [jmin, jmax]
# we need an image of size (imax - imin + 1, jmax - jmin + 1) to house this
out = np.zeros((imax - imin + 1, jmax - jmin + 1, 3), dtype=img.dtype)
# indices have to be shifted by [imin, jmin]

# compute the corresponding (non-integer) coordinates on the domain for interpolation
inew,jnew = np.mgrid[:out.shape[0], :out.shape[1]]
i_back,j_back = np.linalg.inv(T) @ [inew.ravel() + imin, jnew.ravel() + jmin]

# perform 2d interpolation for each colour channel separately
for channel in range(3):
    out[inew, jnew, channel] = ndi.map_coordinates(
        img[..., channel], [i_back, j_back]).reshape(inew.shape)

# check what we've done
plt.imshow(out)
plt.show()

The result is still nice:

[figure: final interpolated version with map_coordinates]

scipy.ndimage: geometric_transform

Finally, I realized that we can go one level higher and use scipy.ndimage.geometric_transform directly. For the rotated raccoon case this seems to be slower than the manual version using map_coordinates, but leads to cleaner code:

import numpy as np
import scipy.ndimage as ndi
import matplotlib.pyplot as plt # for plotting the result
from scipy.misc import face # for dummy data

img = face() # dummy RGB data
n,m = img.shape[:-1]

# transform pixels by 45 degrees
T = np.array([[1, -1],[1, 1]])/np.sqrt(2)
Tinv = np.linalg.inv(T)

# find out the extent of the transformed pixels from the four corners
inew_tmp,jnew_tmp = T @ [[0, 0, n-1, n-1], [0, m-1, 0, m-1]] # 1d arrays each
imin,imax,jmin,jmax = inew_tmp.min(),inew_tmp.max(),jnew_tmp.min(),jnew_tmp.max()
imin,imax,jmin,jmax = (int(round(val)) for val in (imin,imax,jmin,jmax))

# so the pixels of the original map inside [imin, imax] x [jmin, jmax]
# we need an image of size (imax - imin + 1, jmax - jmin + 1) to house this

def transform_func(output_coords):
    """Inverse transform output coordinates back into input coordinates"""
    inew,jnew,channel = output_coords
    i,j = Tinv @ [inew + imin, jnew + jmin]
    return i,j,channel

out = ndi.geometric_transform(img, transform_func,
                              output_shape=(imax - imin + 1, jmax - jmin + 1, 3))

# check what we've done
plt.imshow(out)
plt.show()

Result:

[figure: result using geometric_transform]

Final fix: only numpy

I was primarily concerned with image quality, so all of the above solutions use interpolation in one way or the other. As you explained in comments, this is not of primary concern to you. If this is the case we can modify the version using map_coordinates and calculate approximate (rounded integer) indices ourselves and perform the vectorized assignment:

import numpy as np
import matplotlib.pyplot as plt # for plotting the result
from scipy.misc import face # for dummy data

img = face() # dummy RGB data
n,m = img.shape[:-1]

# transform pixels by 45 degrees
T = np.array([[1, -1],[1, 1]])/np.sqrt(2)

# find out the extent of the transformed pixels from the four corners
inew_tmp,jnew_tmp = T @ [[0, 0, n-1, n-1], [0, m-1, 0, m-1]] # 1d arrays each
imin,imax,jmin,jmax = inew_tmp.min(),inew_tmp.max(),jnew_tmp.min(),jnew_tmp.max()
imin,imax,jmin,jmax = (int(round(val)) for val in (imin,imax,jmin,jmax))

# so the pixels of the original map inside [imin, imax] x [jmin, jmax]
# we need an image of size (imax - imin + 1, jmax - jmin + 1) to house this
out = np.zeros((imax - imin + 1, jmax - jmin + 1, 3), dtype=img.dtype)

# compute the corresponding coordinates on the domain for matching
inew,jnew = np.mgrid[:out.shape[0], :out.shape[1]]
inew = inew.ravel() # 1d array, indices of output array
jnew = jnew.ravel() # 1d array, indices of output array
i_back,j_back = np.linalg.inv(T) @ [inew + imin, jnew + jmin]

# create a mask to grab only those rounded (i_back,j_back) indices which make sense
i_back = i_back.round().astype(int)
j_back = j_back.round().astype(int)
inds = (0 <= i_back) & (i_back < n) & (0 <= j_back) & (j_back < m)
# (i_back[inds], j_back[inds]) maps to (inew[inds], jnew[inds])
# the rest stays black

out[inew[inds], jnew[inds], :] = img[i_back[inds], j_back[inds], :]

# check what we've done
plt.imshow(out)
plt.show()

The result, while full of single-pixel inaccuracies, looks good enough:

[figure: result of the version without interpolation]

  • Your transformation is in the forward direction. My transformation is in the backward direction. Let's say I have two images: Image 1 and Image 2. I have a homography projection that changes the 3D perspective of Image 1 to something that looks like Image 2. Now, if I apply this transformation to Image 1, I will get an image which has a lot of undefined pixels due to rounding. One way to overcome this is to create a black image first. Perform an inverse transformation on it and then pick the corresponding pixel. Do you know what I mean? – stressed out Nov 01 '18 at 12:08
  • @stressedout I don't think I understand. If the transformation is in the opposite direction you just have to switch the arrays, or switch the side where indexing happens. If this still doesn't help I suggest putting a proper [mcve] in your question. – Andras Deak -- Слава Україні Nov 01 '18 at 12:27
  • I don't mind sharing my code, but it's 213 lines. I don't know how to do that here. The transformation is not in the opposite direction. Let me explain it better. Suppose H is a transformation that sends Img1 to a synthesized version of Img2. So, H(Img1) ~ Img2. If you apply this transformation, due to truncation, you will get some black pixels in your image. Just like your example. Do you see the black dots in your image? To overcome this, you will first create a black photo B, then H^{-1}(B) ~ Img1. You won't get black pixels now because all pixels will be covered. Do you see the difference? – stressed out Nov 01 '18 at 12:36
  • @stressedout no, not really. I see the black pixels and started expanding my answer about that, then I saw your comments. I meant adding some code _in your question_. The reason I had to guess is that your question is incomplete. Can you show what you want to happen with a simple transformation? If you use the raccoon rotated with 45 degrees we can compare exactly. – Andras Deak -- Слава Україні Nov 01 '18 at 12:38
  • Let me modify my question based on your code. I will modify your code to tell you what I intend to do. I'll edit my question now. – stressed out Nov 01 '18 at 12:39
  • I edited my question. Please check and see if my question is clear now. Thank you for your patience. – stressed out Nov 01 '18 at 12:52
  • @stressedout I added two interpolated versions. The first one fixes my original approach, uses `scipy.interpolate.griddata`, and needs a lot of RAM and CPU. The second approach is along the lines of what you're actually trying to do, uses `scipy.ndimage.map_coordinates`, and is much more efficient. I wouldn't be surprised if we could do even less work using [`geometric_transform`](https://docs.scipy.org/doc/scipy/reference/generated/scipy.ndimage.geometric_transform.html), but I'd have to think about that. – Andras Deak -- Слава Україні Nov 01 '18 at 16:59
  • @stressedout OK, I added a solution using `geometric_transform`, but the `map_coordinates` version seems more efficient. – Andras Deak -- Слава Україні Nov 01 '18 at 17:15
  • Thanks. I didn't want to use interpolation (after forward transformation) because it's inefficient. However, your other solutions are nice. Is it possible to do this without resorting to scipy? or with just built-in data structures in python and numpy? I'm sure there must be a way to do this without packages, but I can't figure it out. :( It's been driving me nuts since yesterday. – stressed out Nov 01 '18 at 17:57
  • Also, strangely enough, the final image is vertically flipped using this approach. I don't understand what has gone wrong. Being vertically flipped means that the x-coordinate is minus its original value. But I can't figure why it's so. It's strange. – stressed out Nov 01 '18 at 18:03
  • @stressedout I added a version that doesn't interpolate, and uses rounded pixels instead. This is not as accurate as interpolation, but this seems to be what you're looking for. As for being flipped: are you sure you're not just seeing how `pyplot.imshow` shows matrices? In order to show images "the right side up" plots created with `imshow` have pixel indices growing from left to right and from top to bottom (unlike usual plots where y grows from bottom to top). – Andras Deak -- Слава Україні Nov 01 '18 at 18:58
  • No, I have checked it twice. I can't figure why it's flipped. :( I have used Pillow and OpenCV to read and write the image. Every time the final result is flipped. I thought it was because of negative pixels and that's why I asked this question. But I'm still getting the same flipped image. :( It's driving me crazy. – stressed out Nov 02 '18 at 08:22
  • @stressedout that can only happen if your transformation contains a spurious negative sign. Try transforming four corners of a rectangle and plotting each corner with a separate colour with something like `pyplot.scatter`. – Andras Deak -- Слава Україні Nov 02 '18 at 10:45
  • I agree, but there's no negative index. I checked it. I did a np.where() and it showed nothing. I'm sure it's not an issue with negative indices and I can't figure what is happening really. – stressed out Nov 02 '18 at 11:06
  • @stressedout No, not negative. Reversed. Don't assume: test. Draw the four corners before and after transformation. That's the only way to be sure. – Andras Deak -- Слава Україні Nov 02 '18 at 11:10
  • What do you mean by 'reversed'? Doesn't it mean that the x-axis indices become negative? Or do you mean something else? Note that I'm talking about the indices, not their corresponding values. – stressed out Nov 02 '18 at 11:50
  • @stressedout I mean doing `n,m = img.shape[:-1]; points = [[0, 0, n-1, n-1], [0, m-1, 0, m-1]]; plt.scatter(*points, c=list('rgbm')); plt.show()` and the same using your _transformed_ `points`. If after the transformation the order of colours is reversed you'll know that it's your transformation that's flipping the image. Negative indices would only _shift_ the image around cyclically, it wouldn't be able to cause a flip. – Andras Deak -- Слава Україні Nov 02 '18 at 12:07
  • After two days, I finally figured out that the issue with my code was that coordinates in MATLAB and Python are different. (x,y) in MATLAB is equivalent to [y,x] in Python!!! Now my code works fine and I tested it on 5 photos with different homography maps. I remember I had the same problem with OpenCV, where the default channel order is BGR instead of RGB, and it took me almost an hour to figure this out. – stressed out Nov 03 '18 at 12:10
  • Yeah, I knew that about MATLAB (and Fortran). Glad you figured it out, @stressedout. By the way this is row-major vs column-major layout exactly. – Andras Deak -- Слава Україні Nov 03 '18 at 12:45
  • Yeah. Or C-contiguous vs. Fortran-contiguous, as you mentioned in a different comment. I will never ever forget this difference after all the time I spent in vain to find the error in my perfectly fine algorithm. Thank you so much for your patience and your help. – stressed out Nov 03 '18 at 13:25

You can use a map function. Recall the loop you want to replace:

for i in range(0,J.shape[0]):
    for j in range(0, J.shape[1]):
        K[i+a,j+b] = J[T[i,j]]

For example, you can generate all the index tuples of your matrix:

indexes = [ (i,j) for i in range(J.shape[0]) for j in range(J.shape[1]) ]

and then apply map with a lambda function:

f = lambda coords: J[T[coords[0], coords[1]]]
resp = list(map(f, indexes))

At this point resp contains a list of all the applications of f to the indexes. Now you have to reshape it into the right shape for K.

So here, for example, you can make the list of indexes span the full size of K, and return zero from the lambda function whenever the transformed coordinate falls outside J, as sketched below.
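
A minimal sketch of that idea (my illustration, assuming J, T and the shape of K as in the question, with T indexable over K's grid, yielding an (x, y) coordinate pair into J, and with the shift by (a, b) folded into T):

import numpy as np

out_shape = K.shape  # (height, width, 3)
indexes = [(i, j) for i in range(out_shape[0]) for j in range(out_shape[1])]

# return the source pixel if T maps (i, j) inside J, otherwise black
def f(coords):
    x, y = T[coords[0], coords[1]]
    if 0 <= x < J.shape[0] and 0 <= y < J.shape[1]:
        return J[x, y]
    return np.zeros(3, dtype=J.dtype)

K = np.array(list(map(f, indexes))).reshape(out_shape)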

Old answer...

The problem here is that you have to know the size of the output image beforehand. So there are two possibilities: either you compute it, or you assume that it won't be larger than a certain estimate.

So if you compute it, the way to go depends on the transformation you want to apply. For example, a transpose means a swap of the X and Y axis lengths. For a rotation, the size of the result depends on the shape and the angle, as in the sketch below.
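
For instance, for a rotation the exact bounding-box size follows from the absolute values of the rotation matrix entries (a sketch added for illustration; rotated_size is just an illustrative name):

import numpy as np

def rotated_size(n, m, theta):
    """Bounding-box size of an n-by-m image rotated by theta radians."""
    c, s = abs(np.cos(theta)), abs(np.sin(theta))
    return int(np.ceil(n * c + m * s)), int(np.ceil(n * s + m * c))

print(rotated_size(100, 100, np.pi / 4)) # (142, 142): sqrt(2) times larger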

If instead you want to keep it very simple, but not necessarily memory friendly, assume that your transformation won't output an image larger than three times the max of the X and Y lengths.

If you do this, you can easily handle your offsets: if your image is N x M with N > M, the canvas for your transformation will be 3N x 3N.

Now let's say that the output image will be centered in this canvas. In this situation you have to compute the a and b offsets that you describe in your question.

The center of the transformed image along the vertical axis should match the center of the canvas: if i = N/2, then a + i = 3N/2, which implies a = N.

The same applies to the horizontal axis, and in that case: if j = M/2, then b + j = 3N/2, which implies b = (3N - M)/2. (Both are computed in the sketch below.)

I hope it is clear
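
Putting that recipe into code (a sketch; it assumes J as in the question and N > M):

import numpy as np

N, M = J.shape[:2]
canvas = np.zeros((3 * N, 3 * N, 3), dtype=J.dtype) # generous 3N x 3N canvas
a = N                # vertical offset: a + N/2 == 3*N/2
b = (3 * N - M) // 2 # horizontal offset: b + M/2 == 3*N/2
# now assign into the canvas with canvas[i + a, j + b] as in the question's loop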

  • The question is not about determining the size of output or the offsets. I have already determined those precisely. The problem is to do the assignment in a vectorized way without for-loops. – stressed out Oct 31 '18 at 16:25
  • have you tried a map approach? something like passing the permutations of coordinates through a lambda function `map( lambda (i,j): i+j, zip(range(5), range(5)) )` – Gabriel M Oct 31 '18 at 16:29
  • No, I don't know how to do that. I'm not familiar with map and lambda in python. (I'm going to read about them now that I know they're relevant). I'd appreciate it if you explained it in your answer. – stressed out Oct 31 '18 at 16:30
  • I don't understand. :( Where does shifting x by a and y by b come into play? – stressed out Nov 01 '18 at 11:00