You can use the "forward propagation" theorem (you can find it in Hartley and Zisserman's Multiple View Geometry book, chapter 5, page 139).
Basically, if you have a random variable x with mean x_m and covariance C, and a differentiable function f that you apply to x, then, to first order, the mean of f(x) will be approximately f(x_m) and its covariance C_f will be approximately J*C*J^t, where ^t denotes the transpose and J is the Jacobian matrix of f evaluated at x_m.
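To make the theorem concrete, here is a minimal NumPy sketch. The helper names (numerical_jacobian, propagate) and the polar-to-Cartesian example are my own illustrations, not something from the book, and the Jacobian is approximated with central differences rather than derived analytically.

```python
import numpy as np

def numerical_jacobian(f, x, eps=1e-6):
    """Central-difference approximation of the Jacobian of f at x."""
    x = np.asarray(x, dtype=float)
    fx = np.asarray(f(x), dtype=float)
    J = np.zeros((fx.size, x.size))
    for i in range(x.size):
        step = np.zeros_like(x)
        step[i] = eps
        J[:, i] = (np.asarray(f(x + step)) - np.asarray(f(x - step))) / (2.0 * eps)
    return J

def propagate(f, x_m, C, eps=1e-6):
    """First-order propagation: returns (f(x_m), J C J^t)."""
    J = numerical_jacobian(f, x_m, eps)
    return np.asarray(f(x_m), dtype=float), J @ C @ J.T

# Hypothetical example: a point given in polar coordinates (r, theta),
# mapped to Cartesian coordinates.
def polar_to_cartesian(p):
    r, theta = p
    return np.array([r * np.cos(theta), r * np.sin(theta)])

x_m = np.array([2.0, np.pi / 2])   # mean of (r, theta)
C = np.diag([0.01, 0.04])          # covariance of (r, theta)
mean_f, C_f = propagate(polar_to_cartesian, x_m, C)
```

Keep in mind this is only a first-order approximation: it gets worse as f becomes more nonlinear over the spread described by C.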
Let's now consider covariance propagation separately for camera positions and camera orientations.
First, see what happens to the translation parameters of the camera; let's denote them by x_t. In your case, f is a rigid transformation, which means that
f(x_t) = R*x_t + T    // R is a rotation, T a translation, x_t the position of the camera
Now the Jacobian of f with respect to x_t is simply R, so the covariance is given by
C_f = R*C*R^t
which is an interesting result: it indicates that the change in covariance depends only on the rotation. This makes sense, since intuitively, translating the (positional) data doesn't change the axes along which it varies (think about principal component analysis).
Also note that if C is isotropic, i.e. a scalar multiple of the identity, lambda*Identity, then C_f = lambda*Identity, which also makes sense, since intuitively we don't expect an isotropic covariance to change with a rotation.
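A quick numerical check of the position case, as a sketch. The function names and the 90-degree rotation about z are arbitrary choices of mine for illustration. Note that T never appears in the covariance.

```python
import numpy as np

def rotation_z(angle):
    """Rotation matrix for a rotation of `angle` radians about the z axis."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, -s, 0.0],
                     [s,  c, 0.0],
                     [0.0, 0.0, 1.0]])

def transform_position_covariance(R, C):
    """Covariance of R*x_t + T given the covariance C of x_t (T drops out)."""
    return R @ C @ R.T

R = rotation_z(np.pi / 2)

# Anisotropic covariance: the principal axes get rotated along with the data.
C_aniso = np.diag([4.0, 1.0, 0.25])
C_aniso_f = transform_position_covariance(R, C_aniso)

# Isotropic covariance: unchanged by any rotation.
C_iso = 0.5 * np.eye(3)
C_iso_f = transform_position_covariance(R, C_iso)
```

With this particular R, the x and y variances of C_aniso simply swap places, while C_iso comes back unchanged.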
Now consider the orientation parameters. Let's use the Lie algebra of the SO(3) group. In that case, the orientation (yaw, pitch, roll) will be parametrized as v=[alpha_1, alpha_2, alpha_3]^t (they are basically Lie algebra coefficients). In the following, we will use the exponential and logarithm maps between the Lie algebra so(3) and the group SO(3). We can write our function as
f(v) = log(R*exp(v))
In the above, exp(v) is the rotation matrix of your camera, and R is the rotation from your rigid transformation.
Note that translation doesn't affect the orientation parameters. Computing the Jacobian of f with respect to v is mathematically involved. I suspect that you can do it using the adjoint or the Lie algebra, or using the Baker-Campbell-Hausdorff formula; however, that formula is an infinite series, so you would have to truncate it, which limits the precision. Here, we'll take a shortcut and use the result given in this question.
jacobian_f_with_respect_to_v = R*inverse(R*exp(v))
                             = R*exp(v)^t*R^t
So, our covariance will be
R*exp(v)^t*R^t * Cov(v) * (R*exp(v)^t*R^t)^t
= R*exp(v)^t*R^t * Cov(v) * R*exp(v)*R^t
Again, we observe the same thing: if Cov(v) is isotropic, then so is the covariance of f.
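Here is a small sketch of the orientation case that just evaluates the closed-form Jacobian quoted above. The exponential map is implemented with Rodrigues' formula; the function names and the example values of R and v are hypothetical. I'm taking the Jacobian expression as given here, not deriving it.

```python
import numpy as np

def skew(v):
    """Skew-symmetric (cross-product) matrix of a 3-vector."""
    x, y, z = v
    return np.array([[0.0, -z, y],
                     [z, 0.0, -x],
                     [-y, x, 0.0]])

def so3_exp(v):
    """Exponential map so(3) -> SO(3) via Rodrigues' formula."""
    theta = np.linalg.norm(v)
    K = skew(v)
    if theta < 1e-12:
        return np.eye(3) + K  # first-order expansion for tiny angles
    return (np.eye(3)
            + np.sin(theta) / theta * K
            + (1.0 - np.cos(theta)) / theta**2 * (K @ K))

def orientation_jacobian(R, v):
    """Closed form quoted in the answer: R * exp(v)^t * R^t."""
    return R @ so3_exp(v).T @ R.T

def propagate_orientation_covariance(R, v, cov_v):
    """J * Cov(v) * J^t with the Jacobian above."""
    J = orientation_jacobian(R, v)
    return J @ cov_v @ J.T

R = so3_exp(np.array([0.0, 0.0, np.pi / 4]))  # rotation of the rigid transform
v = np.array([0.1, -0.2, 0.3])                # camera orientation in so(3)
cov_iso = 0.01 * np.eye(3)
cov_f = propagate_orientation_covariance(R, v, cov_iso)
```

Since this Jacobian is a product of rotation matrices, it is itself orthogonal, which is why an isotropic Cov(v) comes out unchanged.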
Edit: Answers to the questions you asked in the comments
Why did you assume conditional independence between translation/rotation?
Conditional independence between translation and orientation parameters is often assumed (especially in the pose graph literature, e.g. see Hauke Strasdat's thesis), and I've always found that in practice this works a lot better (not a very convincing argument, I know). However, I admit that I didn't put much thought (if any) into this when writing this answer, because my main point was "use the forward propagation theorem". You can apply it jointly to orientation and position, and all this changes is that your Jacobian will look like
J = [J_R J_T]    // J_R: Jacobian w.r.t. orientation, J_T: Jacobian w.r.t. position
and then the "densification" of the covariance matrix will happen as a result of the propagation J*C*J^t.
Why did you use SO(3) instead of SE(3)?
You said it yourself: I separated the translation parameters from the orientation. SE(3) is the space of rigid transformations, which includes translations. It wouldn't have made sense for me to use it since I had already taken care of the position parameters.
What about the covariance between two cameras?
I think we can still apply the same theorem. The difference now is that your rigid transformation will be a function M(x_1,x_2) of 12 parameters, and your Jacobian will look like [J_R_1 J_R_2 J_T_1 J_T_2]. These can be tedious to compute, as you know, so if you can, just try numerical or automatic differentiation.
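As a sketch of the numerical-differentiation route, here is a joint propagation for a single 6-parameter pose p = [v, t]; the same numerical_jacobian helper would apply unchanged to a 12-parameter function of two cameras. All names and example values are hypothetical, and this so3_log is a simple version that is not reliable for rotation angles close to pi.

```python
import numpy as np

def skew(v):
    x, y, z = v
    return np.array([[0.0, -z, y], [z, 0.0, -x], [-y, x, 0.0]])

def so3_exp(v):
    """Exponential map so(3) -> SO(3) via Rodrigues' formula."""
    theta = np.linalg.norm(v)
    K = skew(v)
    if theta < 1e-12:
        return np.eye(3) + K
    return (np.eye(3) + np.sin(theta) / theta * K
            + (1.0 - np.cos(theta)) / theta**2 * (K @ K))

def so3_log(Rm):
    """Logarithm map SO(3) -> so(3); assumes the angle is well below pi."""
    cos_theta = np.clip((np.trace(Rm) - 1.0) / 2.0, -1.0, 1.0)
    theta = np.arccos(cos_theta)
    w = np.array([Rm[2, 1] - Rm[1, 2], Rm[0, 2] - Rm[2, 0], Rm[1, 0] - Rm[0, 1]])
    if theta < 1e-8:
        return 0.5 * w
    return theta / (2.0 * np.sin(theta)) * w

def numerical_jacobian(f, x, eps=1e-6):
    """Central-difference approximation of the Jacobian of f at x."""
    x = np.asarray(x, dtype=float)
    fx = np.asarray(f(x), dtype=float)
    J = np.zeros((fx.size, x.size))
    for i in range(x.size):
        step = np.zeros_like(x)
        step[i] = eps
        J[:, i] = (np.asarray(f(x + step)) - np.asarray(f(x - step))) / (2.0 * eps)
    return J

def make_pose_transform(R, T):
    """Returns g(p) for p = [v, t]: orientation log(R*exp(v)), position R*t + T."""
    def g(p):
        v, t = p[:3], p[3:]
        return np.concatenate([so3_log(R @ so3_exp(v)), R @ t + T])
    return g

R = so3_exp(np.array([0.0, 0.0, 0.5]))
T = np.array([1.0, 2.0, 3.0])
pose_mean = np.array([0.1, -0.2, 0.3, 0.5, 0.0, -1.0])       # [v, t]
pose_cov = np.diag([1e-4, 1e-4, 1e-4, 1e-2, 1e-2, 1e-2])     # 6x6 covariance

g = make_pose_transform(R, T)
J = numerical_jacobian(g, pose_mean)   # 6x6 joint Jacobian
pose_cov_f = J @ pose_cov @ J.T
```

The position block of J should come out as R, matching the analytic result, so this is also a cheap way to sanity-check any closed-form Jacobian you derive by hand.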