
In the question "What's the numerically best way to calculate the average" it was suggested that calculating a rolling mean, i.e.

mean =  a[n]/n + (n-1)/n * mean

might be numerically more stable than calculating the sum and then dividing by the total number of elements. This was questioned by a commenter. I cannot tell which one is true - can someone else? The advantage of the rolling mean is that the running mean stays small (i.e. roughly the same size as the vector entries), which intuitively should keep the rounding error small. But the commenter claims:

Part of the issue is that 1/n introduces errors in the least significant bits, so n/n != 1, at least when it is performed as a three step operation (divide-store-multiply). This is minimized if the division is only performed once, but you'd be doing it over GB of data.
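
For concreteness, here is a small Python check (IEEE-754 doubles) of that claim - computing 1/n first and then multiplying rounds twice, so n*(1/n) does not always come back as exactly 1:

```python
# Count the n for which reciprocal-then-multiply fails to round back to 1.
bad = [n for n in range(1, 1001) if n * (1.0 / n) != 1.0]
print(len(bad), "values of n <= 1000 with n*(1/n) != 1; first few:", bad[:5])

# A direct division is a single correctly rounded operation, so a/n and
# a*(1/n) can differ in the last bit:
a, n = 7.0, 49.0
print(a / n, a * (1.0 / n), a / n == a * (1.0 / n))
```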

So I have multiple questions:

  1. Is the rolling mean more precise than summing and then dividing? (The update formulas I mean are sketched after this list.)
    • Does that depend on whether 1/n is calculated first and then multiplied?
    • If so, do computers implement division as a single operation? (I thought so, but I am unsure now.)
  2. If yes, is it more precise than Kahan summation followed by a division?
  3. If they are comparable - which one is faster? In both cases we have additional calculations.
  4. If it is more precise, could you use this for precise summation?
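
For reference, here is a minimal Python sketch of the three variants I am comparing (sum-then-divide, Kahan summation followed by a division, and the rolling mean), just to pin down the update formulas - not an optimized implementation:

```python
def mean_sum(a):
    """Plain summation, divide once at the end."""
    s = 0.0
    for x in a:
        s += x
    return s / len(a)


def mean_kahan(a):
    """Kahan (compensated) summation, divide once at the end."""
    s, c = 0.0, 0.0
    for x in a:
        y = x - c          # correct the next term by the running error
        t = s + y
        c = (t - s) - y    # recover the low-order bits lost in t
        s = t
    return s / len(a)


def mean_rolling(a):
    """Rolling mean: mean_n = a[n]/n + (n-1)/n * mean_(n-1)."""
    m = 0.0
    for n, x in enumerate(a, start=1):
        m = x / n + (n - 1) / n * m
    return m
```

Comparing these against a high-precision reference such as `math.fsum(a) / len(a)` on the data of interest would be one way to check question 1 empirically for a given distribution.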
Felix B.

1 Answer

  1. In many circumstances, yes. Consider a sequence of terms that are all positive and all of the same order of magnitude. Summing them produces a large intermediate sum, to which we then add comparatively small terms; each such addition may round right back to the intermediate sum, so the small terms are partially or entirely lost. With the rolling mean, the quantities being combined stay on the same order of magnitude, and in addition the accumulator is much harder to overflow. However, this is not open and shut: adding the terms and then dividing once allows us to use AVX instructions, which are significantly faster than the subtract/divide/add instructions of the rolling loop. Moreover, there are input distributions that make one or the other more accurate. This has been examined in:

Robert F. Ling. Comparison of several algorithms for computing sample means and variances. Journal of the American Statistical Association, 69(348):859–866, 1974.

  2. Kahan summation is an orthogonal issue. You can apply Kahan summation to the sequence x[n] = (x[n-1]-mu)/n; this is very accurate.
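
For illustration, one way to read that suggestion is to write the rolling mean in its incremental form `mu[n] = mu[n-1] + (a[n] - mu[n-1])/n` and apply Kahan-style compensation to the small increments. The sketch below shows only this one interpretation (the name `rolling_mean_compensated` and the variable names are placeholders), not necessarily the exact scheme intended:

```python
def rolling_mean_compensated(a):
    """Incremental mean mu[n] = mu[n-1] + (a[n] - mu[n-1]) / n,
    with Kahan-style compensation applied to the small increments."""
    mu, c = 0.0, 0.0
    for n, x in enumerate(a, start=1):
        d = (x - mu) / n - c   # next increment, corrected by the running error
        t = mu + d
        c = (t - mu) - d       # low-order bits of d lost when adding it to mu
        mu = t
    return mu
```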
user14717
  • The rolling mean suggested in the question seems to be M_13 in this paper, which appears to be the only algorithm that is worse than M_14 regardless of the setting. Although I wonder whether they really implemented M_13 as `m=(1/n)*a[n] ...` instead of `m=a[n]/n ...`, since that should impact its performance, I think. So I am not sure why you answered the first question positively given your citation. Since M_14 is also a rolling mean of sorts, you might have been referring to that instead, but that is not clear from your answer – Felix B. May 08 '20 at 12:30
  • For AVX you could do a sort of batched rolling mean, e.g. `mean = (a[n]+...+a[n+k])/(n+k) + n/(n+k) * mean`, or, to obtain an M_14-esque rolling mean, `mean = (a[n]+...+a[n+k] - k*mean)/(n+k) + mean`. So that problem can probably be avoided. – Felix B. May 08 '20 at 12:34
  • I am not sure what you mean by applying Kahan summation to `x[n] = (x[n-1]-mu)/n` - what is mu? Do you mean a mean of the form `mean = c + sum_n (a[n]-c)/n`, e.g. M_12 or M_21 in the paper? – Felix B. May 08 '20 at 12:38
  • And lastly (regarding 4.): if you combined Kahan with some sort of rolling mean (assuming that actually improved accuracy), could you then improve summation accuracy by calculating the mean instead and multiplying by n at the end? My updated guess, based on the paper: given that the other algorithms (which performed better than the rolling means in most cases) basically divide every number by n beforehand, this probably only shifts the exponent temporarily and does not gain any accuracy. – Felix B. May 08 '20 at 12:45