
I have a set of 2D image keypoints output by the OpenCV FAST corner detection function. Using an Asus Xtion I also have a time-synchronised depth map, with all camera calibration parameters known. Using this information I would like to extract a set of 3D coordinates (a point cloud) in OpenCV.

Can anyone give me any pointers regarding how to do so? Thanks in advance!

Will Andrew
  • Undo the projection. If there is no error in my thinking, the projection should be `p = C*T*P`, where C is the camera intrinsics, T is the camera extrinsics and P is the 3D point. So do something like `P = iT * iC * p`, where iC is the inverse intrinsics and iT is the inverse extrinsics. p must be extended to a homogeneous point whose last coordinate is 1, and the whole point is multiplied by your depth (a sketch of this is given after these comments). – Micka Jul 07 '15 at 10:07
  • If that doesn't work, you can create a ray from (0,0,0) through the pixel (given by iC*p). You can transform that ray into global space by multiplying by T, then move along that ray for a distance of your depth value. – Micka Jul 07 '15 at 10:09
  • Or just have a look at http://nicolas.burrus.name/index.php/Research/KinectCalibration and do the same. – Micka Jul 07 '15 at 10:10
  • Thanks! The link you posted is exactly what I needed. If you post your comments as an answer, I can accept it. – Will Andrew Jul 07 '15 at 10:32
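
A minimal sketch of the back-projection described in the comments above, in Python with NumPy, assuming a pinhole model with known intrinsics K and extrinsics R, t (the function name and defaults are placeholders, not from the thread; depth is taken to be the metric z-depth at the pixel):

    import numpy as np

    def backproject(u, v, depth, K, R=np.eye(3), t=np.zeros(3)):
        """Back-project pixel (u, v) with metric depth to a 3D world point."""
        p = np.array([u, v, 1.0])   # pixel as a homogeneous point
        ray = np.linalg.inv(K) @ p  # ray in camera coordinates, z = 1
        P_cam = ray * depth         # scale along the ray by the depth value
        return R.T @ (P_cam - t)    # undo the extrinsics: camera -> world

With an identity R and zero t this reduces to the per-pixel formula quoted in the answer below.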

1 Answer


Nicolas Burrus has created a great tutorial for depth sensors like the Kinect.

http://nicolas.burrus.name/index.php/Research/KinectCalibration

I'll copy & paste the most important parts:

Mapping depth pixels with color pixels

The first step is to undistort the RGB and depth images using the estimated distortion coefficients. Then, using the depth camera intrinsics, each pixel (x_d, y_d) of the depth camera can be projected to metric 3D space using the following formula:

P3D.x = (x_d - cx_d) * depth(x_d,y_d) / fx_d
P3D.y = (y_d - cy_d) * depth(x_d,y_d) / fy_d
P3D.z = depth(x_d,y_d)

with fx_d, fy_d, cx_d and cy_d the intrinsics of the depth camera.
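
Applied to the original question, a minimal sketch that runs this formula over FAST keypoints and a time-synchronised depth map (Python, assuming the OpenCV 3+ bindings; the depth map is assumed to be already undistorted, aligned with the intensity image, and in meters; the function name is a placeholder):

    import cv2
    import numpy as np

    def keypoints_to_cloud(gray, depth, fx_d, fy_d, cx_d, cy_d):
        """Project FAST keypoints to 3D using an aligned metric depth map."""
        fast = cv2.FastFeatureDetector_create()
        points = []
        for kp in fast.detect(gray, None):
            x_d, y_d = int(round(kp.pt[0])), int(round(kp.pt[1]))
            z = float(depth[y_d, x_d])   # depth at the keypoint, in meters
            if z <= 0:                   # skip missing/invalid depth readings
                continue
            points.append(((x_d - cx_d) * z / fx_d,
                           (y_d - cy_d) * z / fy_d,
                           z))
        return np.array(points)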

If you are further interested in stereo mapping (the values for the Kinect are given below):

We can then reproject each 3D point onto the color image and get its color:

P3D' = R.P3D + T 
P2D_rgb.x = (P3D'.x * fx_rgb / P3D'.z) + cx_rgb
P2D_rgb.y = (P3D'.y * fy_rgb / P3D'.z) + cy_rgb

with R and T the rotation and translation parameters estimated during the stereo calibration.
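
A short sketch of this reprojection step under the same assumptions (R as a 3x3 NumPy array and T as a length-3 vector, both from the stereo calibration; the function name is a placeholder):

    import numpy as np

    def project_to_rgb(P3D, R, T, fx_rgb, fy_rgb, cx_rgb, cy_rgb):
        """Map a 3D point from the depth-camera frame to RGB pixel coords."""
        P = R @ P3D + T                    # depth-camera frame -> RGB frame
        u = P[0] * fx_rgb / P[2] + cx_rgb  # pinhole projection, x
        v = P[1] * fy_rgb / P[2] + cy_rgb  # pinhole projection, y
        return u, v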

The parameters I could estimate for my Kinect are:

Color

fx_rgb 5.2921508098293293e+02 
fy_rgb 5.2556393630057437e+02 
cx_rgb 3.2894272028759258e+02 
cy_rgb 2.6748068171871557e+02 
k1_rgb 2.6451622333009589e-01 
k2_rgb -8.3990749424620825e-01 
p1_rgb -1.9922302173693159e-03 
p2_rgb 1.4371995932897616e-03 
k3_rgb 9.1192465078713847e-01

Depth

fx_d 5.9421434211923247e+02 
fy_d 5.9104053696870778e+02 
cx_d 3.3930780975300314e+02 
cy_d 2.4273913761751615e+02 
k1_d -2.6386489753128833e-01 
k2_d 9.9966832163729757e-01 
p1_d -7.6275862143610667e-04 
p2_d 5.0350940090814270e-03 
k3_d -1.3053628089976321e+00

Relative transform between the sensors (in meters)

R [ 9.9984628826577793e-01, 1.2635359098409581e-03, -1.7487233004436643e-02, 
-1.4779096108364480e-03, 9.9992385683542895e-01, -1.2251380107679535e-02,
1.7470421412464927e-02, 1.2275341476520762e-02, 9.9977202419716948e-01 ]

T [ 1.9985242312092553e-02, -7.4423738761617583e-04, -1.0916736334336222e-02 ]
Micka
  • I would add that `P2D_rgb.x` and `P2D_rgb.y` are indices into the RGB image (starting from (0, 0)). If an index is out of bounds, it means that the pixel is not visible in the RGB image (a small bounds-check sketch follows). – Maghoumi Feb 20 '16 at 22:42
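
Following that comment, a small sketch of the color lookup with the bounds check (rgb is assumed to be an H x W x 3 NumPy image; the helper name is hypothetical):

    def lookup_color(rgb, u, v):
        """Return the color at projected pixel (u, v), or None if the point
        falls outside the RGB image (i.e. it is not visible)."""
        x, y = int(round(u)), int(round(v))
        h, w = rgb.shape[:2]
        if 0 <= x < w and 0 <= y < h:
            return rgb[y, x]
        return None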