
I calibrated a camera according to:

ret, mtx, dist, rvecs, tvecs = cv2.calibrateCamera(objpointslist, imgpointslist, imsize, None, None)

which resulted in:

rvecs = array([[ 0.01375037],
               [-3.03114683],
               [-0.01097119]])
tvecs = array([[ 0.16742439],
               [-0.33141961],
               [13.50338875]])

I calculated the Rotation matrix according to:

R = cv2.Rodrigues(rvecs)[0]
R = array([[-0.99387165, -0.00864604, -0.11020157],
           [-0.00944355,  0.99993285,  0.00671693],
           [ 0.1101361 ,  0.00771646, -0.99388656]])

and created an Rt matrix resulting in:

Rt = array([[-0.99387165, -0.00864604, -0.11020157,  0.16742439],
            [-0.00944355,  0.99993285,  0.00671693, -0.33141961],
            [ 0.1101361 ,  0.00771646, -0.99388656, 13.50338875],
            [ 0.        ,  0.        ,  0.        ,  1.        ]])
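
Roughly like this (a numpy sketch; the exact construction may differ):

import numpy as np

Rt = np.eye(4)                # start from a 4x4 identity
Rt[:3, :3] = R                # top-left 3x3 block: rotation
Rt[:3, 3] = tvecs.ravel()     # top-right column: translation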

Now when I try to get the position of a real-world coordinate [0, 0.4495, 0] in the image according to:

realworldpoint = array([0.    , 0.4495, 0.    , 1.    ], dtype=float32)
imagepoint = np.dot(Rt, realworldpoint)

I get:

array([ 0.16353799,  0.1180502 , 13.5068573 ,  1.        ])

Instead of my expected [1308, 965] position in the image:

array([1308,  965,  0,  1])

I am doubting the integrity of the rotation matrix and the translation vector that calibrateCamera outputs, but maybe I am missing something? I double-checked the inputs to OpenCV's calibrateCamera (objpointslist: 3D coordinates of the centers of the AprilTag/ArUco markers, imgpointslist: 2D positions of the marker centers in the image), and these were all correct...

Could one of you help me out?

I followed OpenCV's documented calibration procedure: [image: OpenCV calibration procedure]

EDIT (2022/02/03): I was able to solve it! Steps for solution:

  1. Run cv2.calibrateCamera() to get the camera intrinsics (camera matrix) and extrinsics (rotation and translation vectors)
  2. Calculate the rotation matrix from the rotation vector with cv2.Rodrigues()
  3. Create the Rt matrix
  4. Use a known point (both its image position [u, v, 1] and its world position [Xw, Yw, Zw, 1] are known) to calculate the scaling factor s

NOTE: To get the right scaling factor, you have to normalize the result so that it actually is [u, v, 1]: divide the projected vector by its third element so that the third element becomes 1. That third element is your scaling factor:

s · [u, v, 1]^T = K · [R | t] · [Xw, Yw, Zw, 1]^T
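
In numpy this looks roughly as follows (a sketch, assuming mtx and Rt from the calibration above, and using my known world point; the resulting u, v should land near the known pixel position):

P = mtx @ Rt[:3, :]                            # 3x4 projection matrix K [R | t]
proj = P @ np.array([0.0, 0.4495, 0.0, 1.0])   # s * [u, v, 1] for the known world point
s = proj[2]                                    # the scaling factor is the third element
u, v = proj[:2] / s                            # homogenized pixel coordinates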

Now that the scaling factor is known, the same equation can be inverted to calculate the XYZ coordinate of an arbitrary image point [u, v, 1]:

[Xw, Yw, Zw]^T = R^(-1) · (s · K^(-1) · [u, v, 1]^T - t)
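
A sketch of that back-projection in numpy (assuming mtx, R, tvecs, and the scaling factor s from the previous step; this works in my case because all points lie on the flat Z=0 calibration plane):

uv1 = np.array([u, v, 1.0])   # image point in homogeneous pixel coordinates
xyz = np.linalg.inv(R) @ (s * np.linalg.inv(mtx) @ uv1 - tvecs.ravel())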

The crucial step in this solution was to calculate the scaling factor by dividing the projected vector by its third element, which is what makes it actually [u, v, 1].

The next step was to correct for lens distortion. This was easily done by applying cv2.undistortPoints to the distorted u/v image point before feeding u and v into the equation above to find the XYZ coordinate of that point.
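
For example (a sketch; u_raw/v_raw are placeholder pixel coordinates, and passing P=mtx keeps the undistorted point in pixel units rather than normalized coordinates):

pts = np.array([[[u_raw, v_raw]]], dtype=np.float32)        # raw (distorted) pixel, shape (1, 1, 2)
u, v = cv2.undistortPoints(pts, mtx, dist, P=mtx).ravel()   # undistorted pixel coordinates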

Matt

2 Answers


Rt is just the transformation (rotation, translation) from world to camera frame.

You expected a projection?

If you want to project the point, you still need to apply the camera matrix (and then maybe the distortion coefficients to be precise, but let's not get into that).
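
Concretely, something like this (a sketch using mtx, Rt, and realworldpoint from the question; distortion ignored):

p = mtx @ (Rt[:3, :] @ realworldpoint)   # apply [R|t] first, then the camera matrix K
u, v = p[:2] / p[2]                      # homogenize: divide by the third element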

Christoph Rackwitz
  • Thanks for reaching out again! I was able to solve the problem. The main step involved finding the scaling factor and using it to find the XYZ coordinate of a uv image point. I edited the original question with my solution! – Matt Feb 03 '22 at 10:56

The crucial step in the solution was to calculate the scaling factor by dividing by the third element of the projected [u, v, 1] vector. I added an explanation to the original question, as well as an explanation of the next step: correcting for lens distortion.

Matt
  • that scaling factor is somewhat arbitrary (result of applying the projection matrix) and *individual to every calculation*. you can't and **must not** reuse it for calculations involving other points. in fact, talking about it makes no sense, because scaling such that z=1 is often called "homogenization". this stuff happens in a projective space and that operation is actually the projection. the matrix multiplication before that just sets everything up for this projection. – Christoph Rackwitz Feb 03 '22 at 14:07
  • further, you can't recover the 3D point from just a projected 2D point. it's always a line/ray. – Christoph Rackwitz Feb 03 '22 at 14:10
  • @Christoph I am actually only interested in getting the X and Y of the real-world coordinates, as Z can be set to 0 (calibrated on a flat plate). This represents a 2D-to-2D transform. The way I currently fixed it follows these sources: https://stackoverflow.com/questions/12299870/computing-x-y-coordinate-3d-from-image-point and https://dsp.stackexchange.com/questions/46588/camera-calibration-and-extrinsic-parameters-for-perspective-transformation/46591#46591 As I'm a bit confused about your comments, what do you think about this? – Matt Feb 03 '22 at 15:07