How to compute standard error from ODR results?

Question

I use scipy.odr in order to make a fit with uncertainties on both x and y following this question Correct fitting with scipy curve_fit including errors in x?

After the fit I would like to compute the uncertainties on the parameters. Thus I look at the square root of the diagonal elements of the covariance matrix. I get :

>>> print(np.sqrt(np.diag(output.cov_beta)))
[ 0.17516591  0.33020487  0.27856021]

But in the Output there is also output.sd_beta which is, according to the doc on odr

Standard errors of the estimated parameters, of shape (p,).

But, it does not give me the same results :

>>> print(output.sd_beta)
[ 0.19705029  0.37145907  0.31336217]

EDIT

This is an example on a notebook : https://nbviewer.jupyter.org/github/gvallverdu/cookbook/blob/master/fit_odr.ipynb

With least square

stop reason: ['Sum of squares convergence']
        params: [ -1.94792946  11.03369235  -5.43265555]
          info: 1
       sd_beta: [ 0.26176284  0.49877962  0.35510071]
sqrt(diag(cov): [ 0.25066236  0.47762805  0.34004208]

With ODR

stop reason: ['Sum of squares convergence']
        params: [-1.93538595  6.141885   -3.80784384]
          info: 1
       sd_beta: [ 0.6941821   0.88909997  0.17292514]
sqrt(diag(cov): [ 0.01093697  0.01400794  0.00272447]

attribute `output.stopreason` is 'Iteration limit reached'. I did explicit odr with `fit_type=0`. The fit looks like good but maybe the number of iteration is not enough ! `output.info` is 4. — Ger, Dec 07 '16 at 23:46
I don't think you can trust `cov_beta` and `sd_beta` if ODR hasn't converged on a solution. Try setting `maxiter` to something large, or check `output.stopreason` and call `output = odr.reset()` if ODR hasn't converged yet. — ali_m, Dec 08 '16 at 00:21
Yes, without convergence you can talk nonsense. I run odr until it reaches convergence. I edit the question with the notebook example. With the least square approach `sd_beta` et `cov_beta` are closed each other. But with explicit ODR there are strong differences. — Ger, Dec 08 '16 at 09:20
Interesting. It seems that sqrt(diag(sd_cov)) is a scalar multiple of sd_beta (in your example the scaling parameter is 0.01575519). — ali_m, Dec 08 '16 at 20:28
Yes, I look at two other examples on more "real" data and again `sb_beta` and `sqrt(diag(sb_cov))` are proportional. But the coefficient is not always the same, I got 0.6981417 or 0.737385 — Ger, Dec 08 '16 at 21:05

ali_m · Accepted Answer · 2016-12-08T22:09:16.383

12

The reason for the discrepancy is that sd_beta is scaled by the residual variance, whereas cov_beta isn't.

scipy.odr is an interface for the ODRPACK FORTRAN library, which is thinly wrapped in __odrpack.c. sd_beta and cov_beta are recovered by indexing into the work vector that's used internally by the FORTRAN routine. The indices of their first elements in work are variables named sd and vcv (see here).

From the ODRPACK documentation (p.85):

WORK(SDI) is the first element of a p × 1 array SD containing the standard deviations ̂σβK of the function parameters β, i.e., the square roots of the diagonal entries of the covariance matrix, where
WORK(SDI-1+K) = SD(K) = ̂V 1/2 β (K, K) = ̂σβK
for K = 1,... ,p.

WORK(VCVI) is the first element of a p × p array VCV containing the values of the covariance matrix of the parameters β prior to scaling by the residual variance, where
WORK(VCVI-1+I+(J-1)*(NP)) = VCV(I,J) = ̂σ⁻²V β(I, J)
for I = 1,... ,p and J = 1,... ,p.

In other words, np.sqrt(np.diag(output.cov_beta * output.res_var)) will give you the same result as output.sd_beta.

I've opened a bug report here.

edited Dec 08 '16 at 22:09

answered Dec 08 '16 at 21:28

ali_m

71,714
23
223
298

Looking at the doc of matlab, the standard error (stdx) seems to be equivalent to the sd_beta of scipy.odr : https://fr.mathworks.com/help/matlab/ref/lscov.html Matlab does multiply by the error. – Ger Dec 09 '16 at 13:43
The line `np.sqrt(np.diag(output.cov_beta * output.res_var))` indeed returns the same value as `output.sd_beta`, but what's not clear to me is: which of these two outputs is the proper standard deviation for the fitted parameters? (Asked here: https://stackoverflow.com/questions/44638882/estimate-the-standard-deviation-of-fitted-parameters-in-scipy-odr) – Gabriel Jun 21 '17 at 19:22
ali_m shouldn't it be the other way around? I.e.: "*`cov_beta` is scaled by the residual variance, whereas `sd_beta` isn't*". – Gabriel Jun 22 '17 at 17:08
2

@Gabriel No, `output.cov_beta` *ought* to be scaled by the residual variance, but it isn't by default (hence why you need to multiply it by `output.res_var`). – ali_m Jun 22 '17 at 18:31

How to compute standard error from ODR results?

1 Answers1

Linked