
I have a set of 5 cameras positioned as shown in the image below. They capture the top, front, rear, left and right views of an object placed in the center.

enter image description here

The origin of the world coordinate system is assumed to be at the top camera, which is therefore used as the reference view. How do I go about calculating the rotation and translation (the external, i.e. extrinsic, parameters) of the other 4 cameras relative to this top camera? The front, rear, left and right cameras have also been slanted 45 degrees (about the x axis) to capture the object in the middle.

The external parameters will later be used to calculate the projection matrix for each camera (the internal parameters are known).

Hassan Zaidi
  • The question isn't very complete. What do you mean by "slanted?" What do you mean by "rotation and translation?" Assuming one possible definition of "slanted," they're rotated about the x or y axis by + or - pi/4, translated 25 cm down, then 20cm left, right, forward, or back. – Gene Apr 28 '22 at 00:12
  • By slanted I mean rotated about the x axis, and the rotation and translation are just the external parameters of the camera. I have edited the question to address the ambiguities you mentioned. – Hassan Zaidi Apr 28 '22 at 00:31
  • Just calibrate. What are the factors that actually prevent you from doing calibration? – fana Apr 28 '22 at 03:17
  • The translation is pretty simple and straightforward; it's the rotation that I am confused about. Because I have to rotate the axes of a camera with reference to the top view, I am not quite sure what the resultant rotation matrix would look like. – Hassan Zaidi Apr 28 '22 at 11:19
  • If you're talking about concatenating homogeneous matrices - which is normal - then you want to define coordinates in camera space and apply the rotation _first_. Then translate the camera origin to the camera's location. Remember that matrices are multiplied in the reverse order of transformation. If Rc is the camera rotation and Tc is the camera translation, then the transformation is Tc * Rc. – Gene Apr 29 '22 at 00:05
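To make the last comment concrete, here is a minimal sketch of how a camera pose of the form Tc * Rc could be turned into extrinsic parameters. It assumes that Tc * Rc maps camera coordinates into the top (reference) camera's frame, so the extrinsics that map reference coordinates into the camera are the inverse: R = Rc^T and t = -Rc^T * C, where C is the camera center. The helper names (`rot_x`, `pose_to_extrinsics`), the axis directions, the signs, and the numeric offsets are all hypothetical; the offsets only loosely follow the 25 cm and 20 cm figures guessed in the comments.

```python
import math

def rot_x(angle_rad):
    """3x3 rotation about the x axis (right-handed, angle in radians)."""
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return [[1.0, 0.0, 0.0],
            [0.0, c, -s],
            [0.0, s, c]]

def transpose(m):
    return [list(row) for row in zip(*m)]

def mat_vec(m, v):
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def pose_to_extrinsics(r_cam_to_ref, cam_center_in_ref):
    """Invert a camera-to-reference pose (Tc * Rc).

    Returns (R, t) such that x_cam = R * x_ref + t.
    """
    r = transpose(r_cam_to_ref)                      # inverse of a rotation
    t = [-x for x in mat_vec(r, cam_center_in_ref)]  # t = -R * C
    return r, t

# Hypothetical side camera: slanted 45 degrees about x, with its center
# offset from the top camera by made-up distances (in meters).
r_pose = rot_x(math.radians(45.0))
center = [0.0, 0.20, 0.25]
R, t = pose_to_extrinsics(r_pose, center)

# Sanity check: the camera center must land on the origin of its own frame.
mapped = [a + b for a, b in zip(mat_vec(R, center), t)]
```

With known intrinsics K, the projection matrix for that camera would then be K * [R | t]. Plain lists are used here only to keep the sketch dependency-free; NumPy arrays would be the more usual choice.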

1 Answer


Calibrate the extrinsic parameters with respect to an object of known shape and size which is visible to all cameras, or at least to all pairs of (reference camera, current camera).

For best results use a 3D object, not a plane. For example, a box with three unequal sides, or a dodecahedron. The latter would allow you to calibrate all cameras simultaneously, since each of them should see three faces at least. Depending on your accuracy requirements, you may need to spend some real money on getting this object machined accurately.

As for software, you can of course whip up a script to do it using OpenCV, or just use a CG tool like Blender, where visualization of the results may be much easier.
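As a rough sketch of the last step of such a script: suppose each camera's pose with respect to the calibration object has already been estimated (for example with OpenCV's `cv2.solvePnP`, converting the rotation vector with `cv2.Rodrigues`), giving (R_i, t_i) with x_i = R_i * X + t_i for an object point X. The extrinsics of camera i relative to the reference camera then follow by eliminating X. The function name `relative_extrinsics` and all numeric poses below are made up for illustration.

```python
import math

def rot_x(angle_rad):
    """3x3 rotation about the x axis (angle in radians)."""
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]

def transpose(m):
    return [list(row) for row in zip(*m)]

def mat_mul(a, b):
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]

def mat_vec(m, v):
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def relative_extrinsics(r_ref, t_ref, r_cam, t_cam):
    """Pose of a camera relative to the reference camera.

    Inputs map object coordinates into each camera: x = R * X + t.
    Returns (R_rel, t_rel) such that x_cam = R_rel * x_ref + t_rel.
    """
    r_rel = mat_mul(r_cam, transpose(r_ref))   # R_cam * R_ref^T
    t_rel = [tc - x for tc, x in zip(t_cam, mat_vec(r_rel, t_ref))]
    return r_rel, t_rel

# Made-up object-to-camera poses for the reference (top) camera and one side camera.
r_ref, t_ref = rot_x(math.radians(10.0)), [0.0, 0.0, 0.5]
r_cam, t_cam = rot_x(math.radians(55.0)), [0.0, 0.1, 0.4]
r_rel, t_rel = relative_extrinsics(r_ref, t_ref, r_cam, t_cam)

# Consistency check on an arbitrary object point X.
X = [0.03, -0.02, 0.01]
x_ref = [a + b for a, b in zip(mat_vec(r_ref, X), t_ref)]
x_cam = [a + b for a, b in zip(mat_vec(r_cam, X), t_cam)]
x_cam_via_ref = [a + b for a, b in zip(mat_vec(r_rel, x_ref), t_rel)]
```

Going through the reference camera should reproduce the direct result, so `x_cam_via_ref` and `x_cam` should agree up to rounding.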

Francesco Callari