
I have a rotation-translation matrix [R T] (3x4).

Is there a function in OpenCV that performs the rotation-translation described by [R T]?

Carlo Pane
  • Yes, but you also need your intrinsic parameters; do you have those? You will also need to make some assumption about the initial position of the image, but usually that isn't an issue. – Hammer Oct 27 '12 at 17:10
  • Yes, I have the intrinsic parameters. – Carlo Pane Oct 27 '12 at 20:16
  • 1
    Next question, what are your R and T matrices relative to? The cameras original position? Some location in world space? ect. – Hammer Oct 27 '12 at 20:23
  • 1
    A location in world space, specifically, the floor. – Carlo Pane Oct 27 '12 at 21:24
  • 1
    Ok, this should be the last question. You know where the camera is now, that is R, and T. If you want to warp an image of the floor from one perspective to another, you need to know what the R, and T of the camera was when the first image was taken. Then you can figure out the R and T between them ect. Do you know R and T when the image was taken? – Hammer Oct 27 '12 at 22:03
  • Yes, I know R and T in each perspective. – Carlo Pane Oct 27 '12 at 23:52

1 Answer


A lot of solutions to this question I think make hidden assumptions. I will try to give you a quick summary of how I think about this problem (I have had to think about it a lot in the past). Warping between two images is a 2 dimensional process accomplished by a 3x3 matrix called a homography. What you have is a 3x4 matrix which defines a transform in 3 dimensions. You can convert between the two by treating your image as a flat plane in 3 dimensional space. The trick then is to decide on the initial position in world space of your image plane. You can then transform its position and project it onto a new image plane with your camera intrinsics matrix.
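To make the 2d half of this concrete: a homography acts on a pixel by appending a 1, multiplying, and dividing by the resulting third component. A minimal standalone sketch (the matrix values in the usage below are made up for illustration):

```cpp
#include <cassert>
#include <cmath>

// Apply a 3x3 homography H to pixel (x, y): treat it as the homogeneous
// vector [x, y, 1], multiply, then divide by the resulting third component.
void applyHomography(const double H[3][3], double x, double y,
                     double &outX, double &outY)
{
    double u = H[0][0]*x + H[0][1]*y + H[0][2];
    double v = H[1][0]*x + H[1][1]*y + H[1][2];
    double w = H[2][0]*x + H[2][1]*y + H[2][2];
    outX = u / w;
    outY = v / w;
}
```

A pure translation homography, for example, carries its offsets in the last column; perspective effects come from a nonzero bottom row.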

The first step is to decide where your initial image lies in world space. Note that this does not have to be the same as what your initial R and T matrices specify; those are in world coordinates, while we are talking about the image created by that world, where all the objects in the image have been flattened into a plane. The simplest choice here is to set the image at a fixed displacement on the z axis with no rotation. From this point on I will assume no rotation. If you would like to see the general case I can provide it, but it is slightly more complicated.

Next you define the transform between your two images in 3d space. Since you have both transforms with respect to the same origin, the transform from [A] to [B] is the same as the transform from [A] to your origin, followed by the transform from the origin to [B]. You can get that by

transform = [B]*inverse([A])
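In 4x4 homogeneous form (bottom row 0 0 0 1) that composition is a plain matrix product, and a rigid [R|T] inverts in closed form as [Rᵀ | -RᵀT]. A minimal sketch with made-up translation-only poses:

```cpp
#include <cassert>
#include <cmath>

// 4x4 homogeneous rigid transform, m[row][col]; bottom row is 0 0 0 1.
struct Mat4 { double m[4][4]; };

Mat4 identity()
{
    Mat4 r{};
    for (int i = 0; i < 4; ++i) r.m[i][i] = 1.0;
    return r;
}

Mat4 multiply(const Mat4 &a, const Mat4 &b)
{
    Mat4 r{};
    for (int i = 0; i < 4; ++i)
        for (int j = 0; j < 4; ++j)
            for (int k = 0; k < 4; ++k)
                r.m[i][j] += a.m[i][k] * b.m[k][j];
    return r;
}

// Closed-form inverse of a rigid transform: [R|T]^-1 = [R^T | -R^T T].
Mat4 rigidInverse(const Mat4 &a)
{
    Mat4 r = identity();
    for (int i = 0; i < 3; ++i) {
        r.m[i][3] = 0.0;
        for (int j = 0; j < 3; ++j) {
            r.m[i][j] = a.m[j][i];              // transpose of R
            r.m[i][3] -= a.m[j][i] * a.m[j][3]; // -R^T * T
        }
    }
    return r;
}
```

So if pose A translates by (1,2,3) and pose B by (4,6,9), `multiply(B, rigidInverse(A))` is the transform that moves the camera from configuration A to configuration B, here a translation by (3,4,6).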

Now conceptually what you need to do is to take your first image, project its pixels onto the geometric interpretation of your image in 3d space, then transform those pixels in 3d space by the transform above, then project them back onto a new 2d image with your camera matrix. Those steps need to be combined into a single 3x3 matrix.

cv::Matx33f convert_3x4_to_3x3(cv::Matx34f pose, cv::Matx33f camera_mat, float zpos)
{
    //converted condenses the 3x4 matrix, which transforms a homogeneous 3d
    //point, into a 3x3 matrix which transforms a homogeneous 2d point.
    //Instead of multiplying pose by a 4x1 3d homogeneous vector, by
    //specifying that the incoming 3d points will ALWAYS have a z coordinate
    //of zpos, one can instead multiply converted by a homogeneous 2d vector
    //and get the same output for x and y.
    cv::Matx33f converted(pose(0,0),pose(0,1),pose(0,2)*zpos+pose(0,3),
                          pose(1,0),pose(1,1),pose(1,2)*zpos+pose(1,3),
                          pose(2,0),pose(2,1),pose(2,2)*zpos+pose(2,3));

    //This matrix takes a homogeneous 2d pixel coordinate, back-projects it
    //through the inverse intrinsics into a ray, and scales that ray so it
    //lands on the flat plane at z = zpos.  Scaling x and y by zpos is
    //equivalent to dividing the last component by zpos, since homogeneous
    //matrices are only defined up to scale.
    cv::Matx33f projected(zpos,0,0,
                          0,zpos,0,
                          0,0,1);
    projected = projected*camera_mat.inv();

    //now we have the pieces: a matrix which takes an incoming 2d point and
    //converts it into a pseudo 3d point on the plane, and a matrix which
    //transforms that pseudo 3d point correctly.  To turn the transformed
    //point back into a 2d point in the new image, simply multiply by the
    //camera matrix.
    return camera_mat*converted*projected;
}
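As an independent sanity check of the geometry (pixel, then ray through the inverse intrinsics, then point on the plane z = zpos, then rigid transform, then reprojection), the same pipeline can be written step by step with plain doubles. This sketch is my own paraphrase, not OpenCV code, and assumes the standard upper-triangular intrinsics matrix [[fx, 0, cx], [0, fy, cy], [0, 0, 1]]:

```cpp
#include <cassert>
#include <cmath>

// Map a pixel through: inverse intrinsics -> plane at z = zpos ->
// rigid transform [R|T] -> intrinsics, one step at a time.
void warpPixel(double fx, double fy, double cx, double cy,
               const double pose[3][4], double zpos,
               double u, double v, double &outU, double &outV)
{
    // Back-project pixel (u, v) to a ray, then scale it onto z = zpos.
    double X = zpos * (u - cx) / fx;
    double Y = zpos * (v - cy) / fy;
    double Z = zpos;

    // Apply the rigid transform.
    double Xt = pose[0][0]*X + pose[0][1]*Y + pose[0][2]*Z + pose[0][3];
    double Yt = pose[1][0]*X + pose[1][1]*Y + pose[1][2]*Z + pose[1][3];
    double Zt = pose[2][0]*X + pose[2][1]*Y + pose[2][2]*Z + pose[2][3];

    // Reproject with the intrinsics (perspective divide by depth).
    outU = fx * Xt / Zt + cx;
    outV = fy * Yt / Zt + cy;
}
```

With the identity pose every pixel maps to itself regardless of zpos; with a pure z translation equal to zpos, the depth doubles and each pixel moves halfway toward the principal point.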

This is probably a more complicated answer than you were looking for, but I hope it gives you an idea of what you are asking. This can be very confusing, and I glossed over some parts of it quickly; feel free to ask for clarification. If you need the solution to work without the assumption that the initial image appears without rotation, let me know; I just didn't want to make it more complicated than it needed to be.

Hammer
  • I've been trying to follow your answer as I'm trying to do something similar; perhaps you could help. What exactly are the inputs to your function, pose and camera_mat? Carlo had mentioned that he has intrinsic and extrinsic matrices for both cameras (so do I). How do you map an (x,y) point from View 1 to the corresponding (x,y) point in View 2? Would you just multiply the 3x3 matrix returned from your function by the (x,y) point in View 1? – Luke Zammit Apr 02 '15 at 14:46
  • 1
    @LukeZammit To convert the point you take your point (x,y) convert it into a 3 vector [x,y,1] and multiply by the 3x3 matrix. Camera_mat is a camera intrinsics matrix. You can find explanations of how the intrinsics matrix works online but conceptually it represents the focal length of a camera. Pose is the transform you want to apply to your image in '3d space' Let me know if you still have questions – Hammer Apr 18 '15 at 01:18