When I add a bunch of floating-point numbers with JavaScript, what is the error bound on the sum? What error bound should be used to check if two sums are equal?
In a simple script, I add a bunch of floating-point numbers and compare the sums. I notice that sometimes the result is not correct (two sums that should be equal are not). I am pretty weak at numerical analysis, but even after reviewing Is floating point math broken?, What Every Computer Scientist Should Know About Floating-Point Arithmetic, and Comparing Floating Point Numbers, 2012 Edition, I am still confused about how best to compare floating-point sums in JavaScript.
First, I was confused by this statement from Goldberg: "The IEEE standard requires that the result of addition, subtraction, multiplication and division be exactly rounded", i.e. computed exactly and then rounded to the nearest floating-point number. If JavaScript uses IEEE arithmetic, how can 0.1 + 0.2 != 0.3?
I think I answered this for myself: it's easier for me to think about an example in base 10. If 1/3 is approximated as 0.333...333 and 2/3 is approximated as 0.666...667, then 1/3 + 1/3 = 0.666...666 is exactly rounded (it is the exact sum of the two approximations) but != 0.666...667. Even with exactly rounded operations, the operands are already rounded approximations (and intermediate results get rounded again), so error can still creep in.
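A quick check in a console seems to confirm this for 0.1 + 0.2: the stored values are already approximations, and the exactly rounded sum of the first two is not the same double as the stored approximation of 0.3 (toPrecision just shows more digits of what is actually stored):

```js
console.log(0.1 + 0.2);                    // 0.30000000000000004
console.log(0.1 + 0.2 === 0.3);            // false
console.log((0.1).toPrecision(20));        // 0.10000000000000000555
console.log((0.2).toPrecision(20));        // 0.20000000000000001110
console.log((0.1 + 0.2).toPrecision(20));  // 0.30000000000000004441
console.log((0.3).toPrecision(20));        // 0.29999999999999998890
```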
How big is machine epsilon? JavaScript numbers are apparently 64-bit IEEE 754 doubles, and machine epsilon for that format is apparently 2^-52, about 2.2e-16?
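If I understand correctly, that value is exposed directly as Number.EPSILON (ES2015+), which is exactly 2^-52:

```js
console.log(Number.EPSILON);                        // 2.220446049250313e-16
console.log(Number.EPSILON === Math.pow(2, -52));   // true
console.log(1 + Number.EPSILON > 1);                // true
console.log(1 + Number.EPSILON / 2 > 1);            // false: rounds back to 1
```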
When I add a bunch (n) of floating-point numbers (naive summation, without pairwise or Kahan summation), what is the error bound on the sum? Intuitively it is proportional to n. The worst-case example I can think of (again in base 10) is 2/3 - 1/3 - 1/3 + 2/3 - 1/3 - 1/3 + etc. I think each pass adds 1 unit in the last place (of the summands) to the error while the true sum remains zero, so both the absolute error and the relative error will grow without bound?
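I tried a small experiment (my own sketch, not from any of the articles above) that seems to show the drift from naive summation, comparing it against Kahan (compensated) summation of the same values:

```js
// Naive left-to-right summation of x added n times.
function naiveSum(n, x) {
  let s = 0;
  for (let i = 0; i < n; i++) s += x;
  return s;
}

// Kahan (compensated) summation of the same values.
function kahanSum(n, x) {
  let s = 0, c = 0;            // c carries the low-order bits lost by s
  for (let i = 0; i < n; i++) {
    const y = x - c;
    const t = s + y;
    c = (t - s) - y;           // recover what was lost when y was added to s
    s = t;
  }
  return s;
}

const n = 1e7;
console.log(naiveSum(n, 0.1)); // not exactly 1000000; the error grows with n
console.log(kahanSum(n, 0.1)); // very close to the exact sum of the stored 0.1 values
```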
In the section "Errors in Summation" Goldberg is more precise (error term is bounded by n * machine epsilon * sum of the absolute values) but also points out that if the sum is being done in an IEEE double precision format, machine epsilon is about 1e-16, so n * machine epsilon will be much less than 1 for any reasonable value of n (n much less than 1e16). How can this error bound be used to check if two floating-point sums are equal? What relationship between the sums, 1, 1e-16, n, etc. must be true if they are equal?
Another intuition: if the numbers are all positive (mine are), then although the error term can grow without bound, the relative error will not, because the sum must grow at the same time. In base 10, the worst case I can think of (in which the error term grows fastest while the sum grows slowest) is repeatedly adding 1.000...005 approximated as 1.000...000. Each addition increases the error term by 1/2 ULP (of the summand, 0.000...005) while increasing the sum by 1 unit in the first place. The worst relative error is 4.5 ULP (0.000...045, reached when the sum is 9.000...000), which is (base - 1) / 2 ULP, which would be 1/2 ULP in base 2?
If two floating-point sums should be equal (their exact values are equal), then the absolute difference of the computed sums must be less than twice that error bound, which would be 1 ULP in base 2? So in JavaScript, is the right check Math.abs(a - b) < a * 1e-16 + b * 1e-16?
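Written with Number.EPSILON instead of the rounded 1e-16, I think that check would look like this (a sketch that takes the positive-summands reasoning above at face value):

```js
// Relative-error comparison: treat two sums as equal if they differ by no
// more than machine epsilon relative to their magnitudes.
function sumsNearlyEqual(a, b) {
  return Math.abs(a - b) <= Number.EPSILON * (Math.abs(a) + Math.abs(b));
}
```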
Comparing Floating Point Numbers, 2012 Edition describes another technique for comparing floating-point numbers, also based on relative error. In JavaScript, is it possible to find the number of representable numbers between two floating-point numbers?
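For example, is something like the following reasonable? (My own sketch using DataView and BigInt to reinterpret the 64 bits of each double, with the usual sign-magnitude-to-ordered-integer mapping so that adjacent doubles are exactly 1 apart.)

```js
// Map a double's bit pattern to an unsigned integer that increases
// monotonically with the float's value (so -0 and +0 map to the same
// integer, and adjacent representable doubles differ by exactly 1).
function orderedBits(x) {
  const view = new DataView(new ArrayBuffer(8));
  view.setFloat64(0, x);
  const bits = view.getBigUint64(0);
  const SIGN = 1n << 63n;
  return bits & SIGN ? SIGN - (bits ^ SIGN) : bits + SIGN;
}

// Number of representable doubles between a and b (their ULP distance).
function ulpDistance(a, b) {
  const d = orderedBits(a) - orderedBits(b);
  return d < 0n ? -d : d;
}

console.log(ulpDistance(0.1 + 0.2, 0.3));        // 1n
console.log(ulpDistance(1, 1 + Number.EPSILON)); // 1n
```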