
The goal is to obtain a bird's eye view from KITTI dataset images, and I have the 3x4 projection matrix.

There are many ways to generate transformation matrices. For the bird's eye view I have come across some mathematical expressions, like:

H12 = H2 * H1^-1 = A * R * A^-1 = P * A^-1 (from "OpenCV - Projection, homography matrix and bird's eye view")

and x = P_i * Tr * X (from "kitti dataset camera projection matrix")

but none of these options worked for my purpose.
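For reference, the first formula above (a homography between two views of the same plane, with A the 3x3 intrinsic matrix and R a rotation between the views) can be sketched in NumPy. The values here are illustrative, not from KITTI:

```python
import numpy as np

def view_homography(A, R):
    """Homography mapping pixels of view 1 to view 2 for a pure
    rotation R between the views: H = A * R * A^-1."""
    return A @ R @ np.linalg.inv(A)

# Example intrinsics: focal length 721.5, principal point (609.6, 172.9)
A = np.array([[721.5,   0.0, 609.6],
              [  0.0, 721.5, 172.9],
              [  0.0,   0.0,   1.0]])

# With no rotation between the views the homography is the identity
H = view_homography(A, np.eye(3))
```

A bird's eye view corresponds to choosing R as the rotation that points the virtual camera straight down at the ground plane.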

PYTHON CODE

import numpy as np
import cv2

image = cv2.imread('Data/RGB/000007.png')
maxHeight, maxWidth = image.shape[:2]

# M has 3x4 dimensions
M = np.array([[721.5377, 0.0, 609.5593, 44.85728],
              [0.0, 721.5377, 172.854, 0.2163791],
              [0.0, 0.0, 1.0, 0.002745884]])

# Here a 3x3 matrix is necessary, but M is 3x4
warped = cv2.warpPerspective(image, M, (maxWidth, maxHeight))

# show the original and warped images
cv2.imshow("Original", image)
cv2.imshow("Warped", warped)
cv2.waitKey(0)

I need to know how to use the projection matrix to obtain the bird's eye view.

So far, everything I have tried produces warped images with nothing even close to what I need.

This is an example image from the KITTI dataset.

This is another example image from the KITTI dataset.

On the left, the images show cars detected in 3D (above) and 2D (below). On the right is the bird's eye view I want to obtain. Therefore, I need the transformation matrix to transform the coordinates of the cars' bounding boxes.

VíctorV
  • can you show one of those kitti bird's eye view images? To get an idea of what you want to achieve – Micka Nov 02 '19 at 06:01
  • I have already added a couple of images in the post. – VíctorV Nov 03 '19 at 19:26
  • you want to get the image on the right? Just define 4 points on the ground plane of your camera images and corresponding 4 points on the bird's eye image, and use getPerspectiveTransform to get the conversion. – Micka Nov 03 '19 at 21:42
  • if you have the boxes in 3d coordinates, just remove the z coordinate to get an orthogonal projection. Afterwards scale and translate the space to your desired image space. – Micka Nov 03 '19 at 21:45
  • I do not want to use the 4-point method because there are thousands of frames I must process, and each frame offers its own projection matrix values. That's why I want to use that matrix; I guess that way I will get more accurate results. – VíctorV Nov 04 '19 at 00:10
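Micka's suggestion for the boxes (drop the z coordinate for an orthogonal top-down projection, then scale and translate into image space) can be sketched as follows. The pixel_per_meter value and origin are assumed example values, not from KITTI:

```python
import numpy as np

def boxes_to_bev(corners_xyz, pixel_per_meter=10.0, origin=(200.0, 400.0)):
    """Orthogonal top-down projection of 3D box corners: drop z,
    scale metres to pixels, and shift so `origin` is the pixel
    position of the point (0, 0) on the ground plane."""
    xy = corners_xyz[:, :2] * pixel_per_meter   # drop z, metres -> pixels
    px = xy.copy()
    px[:, 0] += origin[0]
    px[:, 1] = origin[1] - px[:, 1]             # image y grows downward
    return px

# Two example box corners in ground-plane coordinates (metres)
corners = np.array([[0.0, 0.0, 1.5],
                    [5.0, 2.0, 1.5]])
bev_px = boxes_to_bev(corners)
```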

1 Answer


Here is my code to manually build a bird's eye view transform:

cv::Mat1d CameraModel::getInversePerspectiveMapping(double pixelPerMeter, cv::Point const & origin) const {
    double f = pixelPerMeter * cameraPosition()[2];
    cv::Mat1d R(3,3);
    R <<  0, 1, 0,
          1, 0, 0,
          0, 0, 1;

    cv::Mat1d K(3,3);
    K << f, 0, origin.x, 
         0, f, origin.y, 
         0, 0, 1;
    cv::Mat1d transformtoGround = K * R * mCameraToCarMatrix(cv::Range(0,3), cv::Range(0,3));
    return transformtoGround * mIntrinsicMatrix.inv();
}

The member variables/functions used inside the functions are

  • mCameraToCarMatrix: a 4x4 matrix holding the homogeneous rigid transformation from the camera's coordinate system to the car's coordinate system. The camera's axes are x-right, y-down, z-forward. The car's axes are x-forward, y-left, z-up. Within this function only the rotation part of mCameraToCarMatrix is used.
  • mIntrinsicMatrix: the 3x3 matrix holding the camera's intrinsic parameters
  • cameraPosition()[2]: the Z-coordinate (height) of the camera in the car's coordinate frame. It's the same as mCameraToCarMatrix(2,3).

The function parameters:

  • pixelPerMeter: the resolution of the bird's eye view image. A distance of 1 meter on the XY plane will translate to pixelPerMeter pixels in the bird's eye view image.
  • origin: the camera's position in the bird's eye view image
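For readers working in Python like the asker, here is a rough NumPy port of the function above. The member variables become parameters, and camera_to_car is an assumed 4x4 homogeneous rigid transform as described in the bullet list:

```python
import numpy as np

def inverse_perspective_mapping(intrinsic, camera_to_car,
                                pixel_per_meter, origin):
    """Python sketch of getInversePerspectiveMapping: build the 3x3
    homography taking image pixels to bird's-eye-view pixels."""
    camera_height = camera_to_car[2, 3]      # cameraPosition()[2]
    f = pixel_per_meter * camera_height

    # Swap the car's x (forward) and y (left) axes into image order
    R = np.array([[0.0, 1.0, 0.0],
                  [1.0, 0.0, 0.0],
                  [0.0, 0.0, 1.0]])

    K = np.array([[f,   0.0, origin[0]],
                  [0.0, f,   origin[1]],
                  [0.0, 0.0, 1.0]])

    to_ground = K @ R @ camera_to_car[:3, :3]
    return to_ground @ np.linalg.inv(intrinsic)

# Toy example: identity intrinsics, camera 1.2 m above the ground
cam_to_car = np.eye(4)
cam_to_car[2, 3] = 1.2
H = inverse_perspective_mapping(np.eye(3), cam_to_car, 10.0, (100.0, 200.0))
```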

You can pass the transform matrix to cv::initUndistortRectifyMaps() as newCameraMatrix and then use cv::remap to create the bird's eye view image.

Tobias
  • Is this code used with the images of the KITTI database? I understand the use of the matrix R (rotation) and K (intrinsic), but do you know what would be the relationship between the rest of the variables and the projection matrix offered by KITTI? – VíctorV Nov 03 '19 at 19:31
  • It's my source code from a different application. The projection matrix is a product of the intrinsic matrix and the extrinsic transformations from car to camera frame. Within my code I would compute P as `P = mIntrinsicMatrix * mCameraToCarMatrix.inv()(cv::Range(0,3), cv::Range(0,4));` I'm not sure if you can reverse that. – Tobias Nov 03 '19 at 19:49
  • The problem is that my projection matrix is 3x4; I don't know what I need to get a 3x3 transformation matrix. – VíctorV Nov 04 '19 at 01:40