Can I estimate a camera pose/extrinsic parameters using known object sizes instead of a plane?

Question

I may have titled this badly, if someone wants to suggest a better question I'll change it :)

I have previously calibrated a camera successfully using a ground plane with known world dimensions, but now I'm trying to work out whether I can calculate a camera's extrinsics (the intrinsics are known) just from identifying an object in 2D when I know its height.

[Image: Object in world space with known dimensions, and detected object in screen space]

Frankly, I think this can be done with trigonometry, but I've not figured it out quite yet...

Maybe I can construct triangles between two objects and determine a distance over the ground plane, then do the normal pose estimation once I have a vague plane?

I've been searching, but haven't found any references to algorithms that take this approach... Can it be done?

Can you assume that all objects are strictly vertical and orthogonal to a common plane (hence a plane Z=constant) on which they stand ? — BConic, Feb 26 '14 at 08:35
Yeah. Assuming all objects are going straight up in both world space and in screen space, AND on a flat plane. — Soylent Graham, Feb 26 '14 at 12:18
But I cannot accurately determine their screen-space width, hence just illustrating with lines. — Soylent Graham, Feb 26 '14 at 12:19

score 0 · Answer 1 · answered Mar 01 '14 at 05:19

One thing you can do, if the objects stand on the plane and have equal height, is to determine vanishing points, in other words the points where parallel lines converge at the horizon (infinity). Why is this useful? Consider a projection matrix P, an intrinsic matrix A (hopefully known), and the [R|T] matrix that you are looking for.

P = A*[R|T], and [R|T] = A^-1*P
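The relation above can be checked numerically. This is only a minimal sketch using NumPy: the intrinsic values, the rotation angle, and the translation are made-up numbers for illustration, not values from the question.

```python
import numpy as np

# Hypothetical intrinsics: focal length 800 px, principal point (320, 240).
A = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])

# Hypothetical pose: 30 degree rotation about the x axis plus a translation.
theta = np.deg2rad(30.0)
R = np.array([[1.0, 0.0, 0.0],
              [0.0, np.cos(theta), -np.sin(theta)],
              [0.0, np.sin(theta), np.cos(theta)]])
T = np.array([[0.1], [-0.2], [5.0]])

RT = np.hstack([R, T])   # 3x4 extrinsic matrix [R|T]
P = A @ RT               # 3x4 projection matrix P = A*[R|T]

# With A known, the extrinsics follow from P by undoing the intrinsics.
RT_recovered = np.linalg.inv(A) @ P
```

In practice P is only known up to an overall scale, so the recovered [R|T] usually has to be rescaled so that the rotation columns have unit length.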

From vanishing points you can determine P. Points at infinity have the property that their last homogeneous coordinate is zero. For example, p = [1, 0, 0, 0]^T is the point at infinity in the x direction: converting it to Cartesian coordinates means dividing by the last coordinate, which is zero, so the point lies infinitely far away along x. If you multiply a 3x4 projection matrix by this point on the right, you get the first column of P, so the vanishing point of the x direction is the first column of P (up to an unknown scale factor, because homogeneous image points are only defined up to scale). By the same token, the vanishing points in the y and z directions give you the second and third columns of P. The last column is P*[0 0 0 1]^T, which is the image of the world origin; the camera center itself is -R^T*T once [R|T] is known.
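As a rough illustration of the vanishing-point idea, the sketch below builds a synthetic P, reads off the vanishing points of the x and y directions, and recovers the rotation from them. The function name `rotation_from_vanishing_points` and all numeric values are hypothetical. It assumes noise-free vanishing points with a consistent sign; with real measurements the two recovered directions will not be exactly orthogonal and each has a sign ambiguity that must be resolved separately.

```python
import numpy as np

def rotation_from_vanishing_points(A, vx, vy):
    """Estimate R from homogeneous vanishing points of the world x and y axes."""
    A_inv = np.linalg.inv(A)
    r1 = A_inv @ vx
    r1 /= np.linalg.norm(r1)      # removes the unknown scale of vx
    r2 = A_inv @ vy
    r2 /= np.linalg.norm(r2)      # removes the unknown scale of vy
    r3 = np.cross(r1, r2)         # third axis completes a right-handed frame
    return np.column_stack([r1, r2, r3])

# Hypothetical intrinsics and pose, used only to synthesize test data.
A = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])
ax, ay = np.deg2rad(30.0), np.deg2rad(20.0)
Rx = np.array([[1.0, 0.0, 0.0],
               [0.0, np.cos(ax), -np.sin(ax)],
               [0.0, np.sin(ax), np.cos(ax)]])
Ry = np.array([[np.cos(ay), 0.0, np.sin(ay)],
               [0.0, 1.0, 0.0],
               [-np.sin(ay), 0.0, np.cos(ay)]])
R_true = Rx @ Ry
T = np.array([0.1, -0.2, 5.0])
P = A @ np.column_stack([R_true, T])

# Projecting the points at infinity picks out the first two columns of P.
vx = P @ np.array([1.0, 0.0, 0.0, 0.0])
vy = P @ np.array([0.0, 1.0, 0.0, 0.0])

R_est = rotation_from_vanishing_points(A, vx, vy)

# Camera center in world coordinates; it projects to the zero vector.
C = -R_true.T @ T
```

Vanishing points alone fix the rotation; the translation still needs a metric cue such as the known object height.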

