3

For example for 6-dof camera states, two cameras have 12 state parameters and a 12*12 covariance matrix (assume Gaussian distribution). How does this covariance change when a 6-dof rigid motion is applied to the cameras?

What if the 6-dof is a Gaussian distribution too ?

Ash
  • 4,611
  • 6
  • 27
  • 41
Mr.Guo
  • 167
  • 1
  • 9

1 Answers1

3

You can use the "forward propagation" theorem (you can find it in Hartley and Zisserman's multiple view geometry book, chapter 5, page 139).

Basically, if you have a random variable x with mean x_m and covariance C, and a differentliable function f that you apply to x, then the mean of f(x) will be f(x_m) and its covariance C_f will be approximately JCJ^t, where ^t denotes the transpose, and Jis the Jacobian matrix of f evaluated at x_m.

Let's now consider the problems of the covariance propagation separately for camera positions and camera orientations.

  • First see what happens to the translation parameters of the camera in your case, let's denote them with x_t.In your case, f is a rigid transformation, that means that

    f(x_t)=Rx_t+T //R is a rotation and T a translation, x_t is the position of the camera
    

    Now the Jacobian of f with respect to x_t is simply R, so the covariance is given by

    C_f=RCR^T
    

    which is an interesting result: it indicates that the change in covariance only depends on the rotation. This makes sense, since intuitively, translating the (positional) data doesn't actually changes the axis along which it changes (thing about principal component analysis).

    Also note that if C is isotropic, i.e a diagonal matrix lambda*Identity, then C_f=lambda*Identity, which also makes sense, since intuitively we don't expect an isotropic covariance to change with a rotation.

  • Now consider the orientation parameters. Let's use the Lie algebra of the SO(3) group. In that case, the yaw, pitch, scale will be parametrized as v=[alpha_1, alpha_2, alpha_3]^t (they are basically Lie algebra coefficients). In the following, we will use the exponential and logarithm maps from the Lie algebra so(3) to the group SO(3). We can write our function as

    f(v)=log(R*exp(v)) 
    

    In the above, exp(v) is the rotation matrix of your camera, and R is the rotation from your rigid transformation. Note that translation doesn't affect orientation parameters. Computing the Jacobian of f with respect to v is mathematically involved. I suspect that you can do it using the adjoint or the Lie algebra, or you can do it using the Baker-Campbell-Hausdorff formula, however, you will have to limit the precision. Here, we'll take a shortcut and use the result given in this question.

    jacobian_f_with_respect_to_v=R*inverse(R*exp(v))
    =R*exp(v)^t*R^t
    

    So, our covariance will be

    R*exp(v)^t*R^t *  Cov(v) * (R*exp(v)^t*R^t)^t
    =R*exp(v)^t*R^t * Cov(v) * R * exp(v) * R^t
    

    Again, we observe the same thing: if Cov(v) is isotropic then so is the covariance of f.

Edit: Answers to the questions you asked in the comments

  • Why did you assume conditional independence between translation/rotation?

    Conditional independence between translation/orientation parameters is often assumed in many works (especially in the pose graphe litterature, e.g. see Hauke Strasdat's thesis), and I've always found that in practice, this works a lot better (not a very convincing argument, I know). However, I admit that I didn't put much thought (if any) into this when writing this answer, because my main point was "use the forward propagation theorem". You can apply it jointly to orientation/position, and all this changes is that your Jacobian will look like

    J=[J_R J_T]//J_R Jacobian w.r.t orientation , J_T Jacobian w.r.t position
    

    and then the "densification" of the covariance matrix will happen as a result of the propagation like J^T*C*J.

  • Why did you use SO(3) instead of SE(3)?

    You said it yourself, I separated the translation parameters from the orientation. SE(3) is the space of rigid transformation, which includes translations. It wouldn't have made sense for me to use it since I already had taken care of the position parameters.

  • What about the covariance between two cameras?

    I think we can still apply the same theorem. The difference now is your rigid transformation will be a function M(x_1,x_2) of 12 parameters, and your Jacobian will look like [J_R_1 J_R_2 J_T_1 J_T2]. These can be tedious to compute as you know, so if you can just try numeric or automatic differentiation.

Ash
  • 4,611
  • 6
  • 27
  • 41
  • thank you for your detailed explanation. but I have a question, you consider the position and orientation separately, what if they are corresponding? and why do you use so3 instead of se3. – Mr.Guo Mar 24 '18 at 01:18
  • and more important, If I have two cameras as my question explain, I think a transform would influence the co-variance between them, how to deduce this change? – Mr.Guo Mar 24 '18 at 01:20
  • @Mr.Guo My pleasure, thanks for these observations! I updated the answer to address your questions as well as I could. Don't hesitate to correct me if you find something that doesn't make sense. – Ash Mar 24 '18 at 13:25
  • thank you for your supplement, as I think, se3 may be a more compact expression for rigid transformation. but in practice most of slam use so3+position parameterization instead of se3 ,as equ(21) in [SVO2](https://www.cc.gatech.edu/~dellaert/pubs/Forster16tro.pdf) ,I don't know if this parameterization method is the same as your deduce? it looks similar :) – Mr.Guo Mar 26 '18 at 02:17
  • @Mr.Guo Oh no not that paper again! xD I've had to go through is so many times these past few weeks that I just want to kill myself when I see it. But what is SVO2? Is it an opensource implementation or something (if so could you please share the link)? Anyway, I might be mistaken, but I think that SO(3)+translation is exactly the same as SE(3), sot it doesn't matter which one you chose, all it affects is the form of your formulas. – Ash Mar 26 '18 at 09:34
  • Also important to note that none of this holds up for *large* rotations, because the first order Taylor series approximation of the rotation breaks down. This kind up update is suitable for small, incremental updates to camera positions. – Ben Jackson Apr 09 '18 at 23:14
  • @BenJackson Thanks for the comment... Sorry if I'm missing something, but I don't think I used the Taylor series approximation. I didn't read the proof of the derivative formula I linked to in detail, but it didn't seem to make that assumption. Might be that I'm just tired, but I just don't see it. The `Rexp(v)` part doesn't make that assumption either... – Ash Apr 12 '18 at 13:21
  • @Ash The Jacobian in your covariance update comes from approximating the function as linear around the current state. – Ben Jackson Apr 12 '18 at 17:07