
I have a set of 2D image keypoints output by the OpenCV FAST corner detection function. Using an Asus Xtion I also have a time-synchronised depth map, with all camera calibration parameters known. Using this information I would like to extract a set of 3D coordinates (a point cloud) in OpenCV.

Can anyone give me any pointers regarding how to do so? Thanks in advance!

Will Andrew
  • Undo the projection. If there is no error in my thinking, the projection should be `p = C*T*P`, where C is the camera intrinsics, T is the camera extrinsics and P is the 3D point. So do something like `P = iT * iC * p`, where iC is the inverse intrinsics and iT is the inverse extrinsics. p must be extended to a homogeneous point whose last coordinate is 1, and the whole point is multiplied by your depth (a sketch of this is given after these comments). – Micka Jul 07 '15 at 10:07
  • If that doesn't work, you can create a ray from (0,0,0) through the pixel (given by iC*p). You can transform that ray into global space by multiplying by T, then move along that ray for a distance of your depth value. – Micka Jul 07 '15 at 10:09
  • Or just have a look at http://nicolas.burrus.name/index.php/Research/KinectCalibration and do the same. – Micka Jul 07 '15 at 10:10
  • Thanks! The link you posted is exactly what I needed. If you post your comments as an answer, I can accept it. – Will Andrew Jul 07 '15 at 10:32
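
A minimal sketch of the back-projection described in the comments above, in Python with NumPy, assuming a pinhole model with known intrinsics K and extrinsics R, t (the function name and defaults are placeholders, not from the thread; depth is taken to be the metric z-depth at the pixel):

    import numpy as np

    def backproject(u, v, depth, K, R=np.eye(3), t=np.zeros(3)):
        """Back-project pixel (u, v) with metric depth to a 3D world point."""
        p = np.array([u, v, 1.0])   # pixel as a homogeneous point
        ray = np.linalg.inv(K) @ p  # ray in camera coordinates, z = 1
        P_cam = ray * depth         # scale along the ray by the depth value
        return R.T @ (P_cam - t)    # undo the extrinsics: camera -> world

With an identity R and zero t this reduces to the per-pixel formula quoted in the answer below.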

1 Answer


Nicolas Burrus has created a great tutorial for depth sensors like the Kinect.

http://nicolas.burrus.name/index.php/Research/KinectCalibration

I'll copy & paste the most important parts:

Mapping depth pixels with color pixels

The first step is to undistort the RGB and depth images using the estimated distortion coefficients. Then, using the depth camera intrinsics, each pixel (x_d, y_d) of the depth camera can be projected to metric 3D space using the following formula:

P3D.x = (x_d - cx_d) * depth(x_d,y_d) / fx_d
P3D.y = (y_d - cy_d) * depth(x_d,y_d) / fy_d
P3D.z = depth(x_d,y_d)

with fx_d, fy_d, cx_d and cy_d the intrinsics of the depth camera.
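
Applied to the original question, a minimal sketch that runs this formula over FAST keypoints and a time-synchronised depth map (Python, assuming the OpenCV 3+ bindings; the depth map is assumed to be already undistorted, aligned with the intensity image, and in meters; the function name is a placeholder):

    import cv2
    import numpy as np

    def keypoints_to_cloud(gray, depth, fx_d, fy_d, cx_d, cy_d):
        """Project FAST keypoints to 3D using an aligned metric depth map."""
        fast = cv2.FastFeatureDetector_create()
        points = []
        for kp in fast.detect(gray, None):
            x_d, y_d = int(round(kp.pt[0])), int(round(kp.pt[1]))
            z = float(depth[y_d, x_d])   # depth at the keypoint, in meters
            if z <= 0:                   # skip missing/invalid depth readings
                continue
            points.append(((x_d - cx_d) * z / fx_d,
                           (y_d - cy_d) * z / fy_d,
                           z))
        return np.array(points)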

If you are further interested in stereo mapping (the values for the Kinect are given below):

We can then reproject each 3D point onto the color image and get its color:

P3D' = R.P3D + T 
P2D_rgb.x = (P3D'.x * fx_rgb / P3D'.z) + cx_rgb
P2D_rgb.y = (P3D'.y * fy_rgb / P3D'.z) + cy_rgb

with R and T the rotation and translation parameters estimated during the stereo calibration.
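
A short sketch of this reprojection step under the same assumptions (R as a 3x3 NumPy array and T as a length-3 vector, both from the stereo calibration; the function name is a placeholder):

    import numpy as np

    def project_to_rgb(P3D, R, T, fx_rgb, fy_rgb, cx_rgb, cy_rgb):
        """Map a 3D point from the depth-camera frame to RGB pixel coords."""
        P = R @ P3D + T                    # depth-camera frame -> RGB frame
        u = P[0] * fx_rgb / P[2] + cx_rgb  # pinhole projection, x
        v = P[1] * fy_rgb / P[2] + cy_rgb  # pinhole projection, y
        return u, v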

The parameters I could estimate for my Kinect are:

Color

fx_rgb 5.2921508098293293e+02 
fy_rgb 5.2556393630057437e+02 
cx_rgb 3.2894272028759258e+02 
cy_rgb 2.6748068171871557e+02 
k1_rgb 2.6451622333009589e-01 
k2_rgb -8.3990749424620825e-01 
p1_rgb -1.9922302173693159e-03 
p2_rgb 1.4371995932897616e-03 
k3_rgb 9.1192465078713847e-01

Depth

fx_d 5.9421434211923247e+02 
fy_d 5.9104053696870778e+02 
cx_d 3.3930780975300314e+02 
cy_d 2.4273913761751615e+02 
k1_d -2.6386489753128833e-01 
k2_d 9.9966832163729757e-01 
p1_d -7.6275862143610667e-04 
p2_d 5.0350940090814270e-03 
k3_d -1.3053628089976321e+00

Relative transform between the sensors (in meters)

R [ 9.9984628826577793e-01, 1.2635359098409581e-03, -1.7487233004436643e-02, 
-1.4779096108364480e-03, 9.9992385683542895e-01, -1.2251380107679535e-02,
1.7470421412464927e-02, 1.2275341476520762e-02, 9.9977202419716948e-01 ]

T [ 1.9985242312092553e-02, -7.4423738761617583e-04, -1.0916736334336222e-02 ]
Micka
  • I would add that `P2D_rgb.x` and `P2D_rgb.y` are indices into the RGB image (starting from (0, 0)). If an index is out of bounds, it means that the pixel is not visible in the RGB image (a small bounds-check sketch follows). – Maghoumi Feb 20 '16 at 22:42
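
Following that comment, a small sketch of the color lookup with the bounds check (rgb is assumed to be an H x W x 3 NumPy image; the helper name is hypothetical):

    def lookup_color(rgb, u, v):
        """Return the color at projected pixel (u, v), or None if the point
        falls outside the RGB image (i.e. it is not visible)."""
        x, y = int(round(u)), int(round(v))
        h, w = rgb.shape[:2]
        if 0 <= x < w and 0 <= y < h:
            return rgb[y, x]
        return None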