3

I'm struggling with a specific computer vision task right now. Imagine we have a camera frame of a road, for example. I want to generate a new frame as it would be seen by an imaginary camera that has been translated horizontally and also rotated by a small angle. To illustrate this, I uploaded a demonstration image:

Demonstration

How can I create the new frame from the original one in Python? I already use OpenCV for my other computer vision tasks.

johni07
  • 1
    Suppose you take the two pictures as you show, but without the `x` translation, one meter from a planar surface. Even though you'll get some perspective change, you'll also see more of the scene on the right and less on the left, so some translation is involved. Now imagine doing the same one kilometer from the wall: far more translation for the same angle. So you need more information, depth in particular. [Here's a very detailed slide deck](http://6.869.csail.mit.edu/fa12/lectures/lecture13ransac/lecture13ransac.pdf) which covers many aspects of this problem. – alkasm Aug 22 '17 at 07:46
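
The depth dependence described in the comment above can be sketched numerically. This is a minimal illustration that assumes an ideal pinhole camera facing a flat wall head-on; the function names and the sample numbers are hypothetical and not from the thread.

```python
import math

def wall_shift_from_rotation(distance_m, angle_deg):
    """Sideways shift (in metres) of the wall point seen at the image centre
    after rotating the camera by angle_deg, for a wall distance_m away."""
    return distance_m * math.tan(math.radians(angle_deg))

def pixel_shift_from_translation(f_px, tx_m, depth_m):
    """Image shift (in pixels) of a point at depth_m when the camera
    moves tx_m sideways, for a focal length of f_px pixels."""
    return f_px * tx_m / depth_m

# Same 5 degree rotation, very different shift on the wall
for d in (1.0, 1000.0):
    print(f"wall at {d} m: centre moves {wall_shift_from_rotation(d, 5.0):.3f} m")

# Same 0.5 m translation, very different pixel shift depending on depth
for z in (2.0, 20.0):
    print(f"point at {z} m: shifts {pixel_shift_from_translation(500.0, 0.5, z)} px")
```

Because the pixel shift caused by a translation depends on each point's depth, a single image warp can only be exact when the scene is (or is assumed to be) a single plane.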

2 Answers

5

I struggled with this for a while too, until I found a helpful post that shares some example code. I understood in theory that you can generate the new frame with OpenCV's warpPerspective function if you have the homography matrix. Since you know the exact translation and rotation values, you can derive the matrix yourself given the camera's intrinsic parameters. However, it wasn't until I tried the code myself that I fully saw how it's done.

We know that the projection of a 3D point in space onto the 2D image is described by the matrix

H = K[R|T]

To transform points from one 2D image to another, you back-project the points to 3D first and then reproject them onto the new image plane.

x' = K * [R2|T2] * [R1|T1]^(-1) * K^(-1) * x

The product [R2|T2] * [R1|T1]^(-1) is a single transformation matrix that gives the relative transformation from one camera pose to the other. The pose matrices are made 4x4 by appending the row [0, 0, 0, 1]; in the code below, K is padded to 3x4 and its inverse to 4x3 so that the overall product is a 3x3 matrix.
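
The formula above can be sketched with NumPy alone. This is a minimal sketch, not the code from the referenced post: the helper names `pose`, `rot_y` and `homography_between_poses` are hypothetical, and it assumes the image is treated as a flat plane at depth f in front of the first camera, so translations are expressed in pixel units.

```python
import numpy as np

def pose(R, t):
    """Build the 4x4 matrix [R|t] with the row [0, 0, 0, 1] appended."""
    M = np.eye(4)
    M[:3, :3] = R
    M[:3, 3] = t
    return M

def rot_y(angle_rad):
    """3x3 rotation about the Y (vertical) axis."""
    c, s = np.cos(angle_rad), np.sin(angle_rad)
    return np.array([[c, 0, s], [0, 1, 0], [-s, 0, c]])

def homography_between_poses(f, w, h, M1, M2):
    """3x3 matrix K * M2 * M1^(-1) * K^(-1) for an image of size w x h."""
    # Intrinsics padded to 3x4 so they can multiply 4x4 pose matrices
    K = np.array([[f, 0, w / 2, 0],
                  [0, f, h / 2, 0],
                  [0, 0, 1, 0]], dtype=float)
    # Back-projection padded to 4x3: pixel (u, v, 1) -> (u - w/2, v - h/2, f, 1)
    Kinv = np.zeros((4, 3))
    Kinv[:3, :3] = np.linalg.inv(K[:3, :3]) * f
    Kinv[3, :] = [0, 0, 1]
    relative = M2 @ np.linalg.inv(M1)  # relative motion between the two poses
    return K @ relative @ Kinv

M1 = pose(np.eye(3), [0, 0, 0])
M2 = pose(rot_y(np.deg2rad(5)), [50, 0, 0])  # sideways shift plus a small yaw
H = homography_between_poses(500, 640, 480, M1, M2)
print(H / H[2, 2])  # normalised so the bottom-right entry is 1
```

With two identical poses the result is a multiple of the identity matrix, which is a quick sanity check before passing H to warpPerspective.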

Here's some sample code, adapted from the code in the same post.

import cv2
import numpy as np

f = 500
rotXval = 90
rotYval = 90
rotZval = 90
distXval = 500
distYval = 500
distZval = 500

def onFchange(val):
    global f
    f = val
def onRotXChange(val):
    global rotXval
    rotXval = val
def onRotYChange(val):
    global rotYval
    rotYval = val
def onRotZChange(val):
    global rotZval
    rotZval = val
def onDistXChange(val):
    global distXval
    distXval = val
def onDistYChange(val):
    global distYval
    distYval = val
def onDistZChange(val):
    global distZval
    distZval = val

if __name__ == '__main__':

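    # Read the input image and create the output image
    src = cv2.imread('test.jpg')
    if src is None:
        # cv2.imread returns None instead of raising when the file cannot be read
        raise FileNotFoundError("Could not read 'test.jpg'")
    src = cv2.resize(src,(640,480))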
    dst = np.zeros_like(src)
    h, w = src.shape[:2]

    # Create a user interface with trackbars for modifying the transformation parameters
    wndname1 = "Source:"
    wndname2 = "WarpPerspective: "
    cv2.namedWindow(wndname1, 1)
    cv2.namedWindow(wndname2, 1)
    cv2.createTrackbar("f", wndname2, f, 1000, onFchange)
    cv2.createTrackbar("Rotation X", wndname2, rotXval, 180, onRotXChange)
    cv2.createTrackbar("Rotation Y", wndname2, rotYval, 180, onRotYChange)
    cv2.createTrackbar("Rotation Z", wndname2, rotZval, 180, onRotZChange)
    cv2.createTrackbar("Distance X", wndname2, distXval, 1000, onDistXChange)
    cv2.createTrackbar("Distance Y", wndname2, distYval, 1000, onDistYChange)
    cv2.createTrackbar("Distance Z", wndname2, distZval, 1000, onDistZChange)

    #Show original image
    cv2.imshow(wndname1, src)

    k = -1
    while k != 27:

        if f <= 0: f = 1
        rotX = (rotXval - 90)*np.pi/180
        rotY = (rotYval - 90)*np.pi/180
        rotZ = (rotZval - 90)*np.pi/180
        distX = distXval - 500
        distY = distYval - 500
        distZ = distZval - 500

        # Camera intrinsic matrix
        K = np.array([[f, 0, w/2, 0],
                    [0, f, h/2, 0],
                    [0, 0,   1, 0]])

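        # K inverse, padded to 4x3 and scaled by f so that pixel (u, v, 1)
        # back-projects to the 3D point (u - w/2, v - h/2, f, 1),
        # i.e. the image is treated as a plane at depth f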
        Kinv = np.zeros((4,3))
        Kinv[:3,:3] = np.linalg.inv(K[:3,:3])*f
        Kinv[-1,:] = [0, 0, 1]

        # Rotation matrices around the X,Y,Z axis
        RX = np.array([[1,           0,            0, 0],
                    [0,np.cos(rotX),-np.sin(rotX), 0],
                    [0,np.sin(rotX),np.cos(rotX) , 0],
                    [0,           0,            0, 1]])

        RY = np.array([[ np.cos(rotY), 0, np.sin(rotY), 0],
                    [            0, 1,            0, 0],
                    [ -np.sin(rotY), 0, np.cos(rotY), 0],
                    [            0, 0,            0, 1]])

        RZ = np.array([[ np.cos(rotZ), -np.sin(rotZ), 0, 0],
                    [ np.sin(rotZ), np.cos(rotZ), 0, 0],
                    [            0,            0, 1, 0],
                    [            0,            0, 0, 1]])

        # Composed rotation matrix with (RX,RY,RZ)
        R = np.linalg.multi_dot([ RX , RY , RZ ])

        # Translation matrix
        T = np.array([[1,0,0,distX],
                    [0,1,0,distY],
                    [0,0,1,distZ],
                    [0,0,0,1]])

        # Overall homography matrix
        H = np.linalg.multi_dot([K, R, T, Kinv])

        # Apply matrix transformation
        cv2.warpPerspective(src, H, (w, h), dst, flags=cv2.INTER_NEAREST,
                            borderMode=cv2.BORDER_CONSTANT, borderValue=0)

        # Show the image
        cv2.imshow(wndname2, dst)
        k = cv2.waitKey(1)
emilyfy
1

If you are trying to translate the image and change its plane, that can be done with a homography matrix. Read up on the perspective transform.


You need to play with the values H(0,2) and H(2,0) of the matrix to translate along X and to tilt the image plane by an angle, as in your image.

First find the homography matrix of the image with itself (this is the identity matrix), then change the matrix entries mentioned above and warp the image. You will get the result you wanted.

Edit: A homography is simply a 3x3 matrix. Each matrix element corresponds to a specific manipulation of the image.

For example, the element at position (0,0) stretches the image horizontally. The element at position (1,0) skews the image, like keeping the left edge still and pulling the right edge down. Likewise, the other elements perform their respective operations.

In the homography matrix, the elements at (2,0) and (0,2) handle the task you want, i.e. tilting the plane and moving along the X direction, respectively. By changing (playing with) those values, you get different perspectives of the image. This is why it is also called a perspective transform.
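
The effect of individual matrix entries can be checked by pushing the image corners through the matrix. This is a small NumPy sketch; the helper `apply_h`, the 640x480 corner coordinates and the sample values 100 and 0.0005 are illustrative assumptions.

```python
import numpy as np

def apply_h(H, pts):
    """Apply a 3x3 homography to an (N, 2) array of points."""
    pts = np.asarray(pts, dtype=float)
    homog = np.hstack([pts, np.ones((len(pts), 1))])  # to homogeneous coordinates
    mapped = homog @ H.T
    return mapped[:, :2] / mapped[:, 2:3]             # divide by the last coordinate

# Corners of a 640x480 image: top-left, top-right, bottom-right, bottom-left
corners = [(0, 0), (640, 0), (640, 480), (0, 480)]

H_shift = np.eye(3)
H_shift[0, 2] = 100      # H(0,2): translation along X

H_persp = np.eye(3)
H_persp[2, 0] = 0.0005   # H(2,0): perspective term that depends on x

print(apply_h(H_shift, corners))  # every corner moves 100 px to the right
print(apply_h(H_persp, corners))  # left edge stays put, right edge shrinks
```

Starting from the identity and changing one entry at a time makes it easier to see which entry produces which distortion before combining them in warpPerspective.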

Community
I.Newton
  • Hi, this sounds quite easy, but what do you mean by 'play with the values H(0,2) and H(2,0)'? Sorry, I'm quite new to this topic. – johni07 Aug 22 '17 at 13:13
  • OK, I edited the answer and added an explanation in the last part. Please check it. If you need help with the code, feel free to ask. – I.Newton Aug 22 '17 at 16:16