R: approximation error with simple algebraic sums

Question

I had an apparent bug in a script. After several hours, I discovered it was a problem of decimal approximations.

To make the problem reproducible, consider this:

0.02 - 0.000904686260609299 - 0.005   ==  
                                  0.02 + (-0.000904686260609299 - 0.005)
#[1] FALSE

Where:

print(0.02 -0.000904686260609299 -0.005, 22)
print(0.02 + (-0.000904686260609299 -0.005), 22)
#[1] 0.01409531373939069964774
#[1] 0.01409531373939070138246

Imagine a situation where you have long vectors a,b,c:

a -b -c  ==  a + (-b -c)

The difference might be statistically significant.

Can I increase the internal precision so that the test above returns TRUE?

If I have to choose, which result is the better approximation: a -b -c or a + (-b -c)?

`?"=="`: "For numerical and complex values, remember == and != do not allow for the finite representation of fractions, nor for rounding error. Using all.equal with identical is almost always preferable. See the examples." — A. Webb, Nov 04 '18 at 18:12
you can use all.equal. `all.equal(0.02 -0.000904686260609299 -0.005, 0.02 + (-0.000904686260609299 -0.005))` results in TRUE. [See this post](https://stackoverflow.com/questions/9508518/why-are-these-numbers-not-equal/9508558#9508558) for more info on all.equal and comparisons of numbers. — phiver, Nov 04 '18 at 18:15
Would one of the commenters please post their comment as an answer? (Answer to the last question: in general, adding/subtracting values of similar magnitudes is most accurate, so you should do those first; in the example above, probably `a-(b+c)`.) — Ben Bolker, Nov 04 '18 at 18:21
Also note what `help("all.equal")` has to say: *Do not use all.equal directly in if expressions—either use isTRUE(all.equal(....))*. The *either* means it also suggests `identical`, if appropriate. — Rui Barradas, Nov 04 '18 at 18:30
@BenBolker, I would almost say this is a sort of dupe of the post I linked to. At least the answer to this question is in one of the answers to that one. — phiver, Nov 04 '18 at 18:42
@BenBolker: this suggests that, in a script, one should first test the number of digits and decide accordingly. — antonio, Nov 04 '18 at 19:54
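
The advice in the comments above can be sketched as follows. This is a minimal illustration, not a definitive recipe: the helper `near()`, its default tolerance, and the random vectors are assumptions introduced here and do not come from the original posts. The default tolerance mirrors the one documented for `all.equal`, but `near()` compares absolute differences element by element, whereas `all.equal` reports a single verdict based on the mean relative difference.

```r
# Scalars from the question
x <- 0.02 - 0.000904686260609299 - 0.005
y <- 0.02 + (-0.000904686260609299 - 0.005)

x == y                    # exact comparison; FALSE in the question's session
isTRUE(all.equal(x, y))   # tolerant comparison, safe to use inside if()

# Hypothetical helper (not part of base R): element-wise comparison
# using an absolute tolerance.
near <- function(u, v, tol = sqrt(.Machine$double.eps)) {
  abs(u - v) < tol
}

# Illustrative vectors; the values are made up for this sketch
set.seed(1)
a <- runif(1000)
b <- runif(1000)
c <- runif(1000)

lhs <- a - b - c
rhs <- a + (-b - c)

all(near(lhs, rhs))          # one logical per element, then combined
isTRUE(all.equal(lhs, rhs))  # single overall verdict for the whole vector
```

Use the element-wise form when you need to know which entries disagree, and `isTRUE(all.equal(...))` when a single yes/no answer is enough.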

score 1 · Accepted Answer · edited Nov 04 '18 at 21:43

To provide an answer to the first question (can the floating point accuracy be increased?): The gmp package provides support for arbitrary precision arithmetic.

install.packages("gmp", dependencies=TRUE)  # didn't see any dependencies
library(gmp)
z <- as.bigq(-0.000904686260609299)  # make smallest value a big-rational
0.02 + z - 0.005 == 0.02 + (z - 0.005)
#[1] TRUE
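
Extending the snippet above, the following sketch converts all three operands to big rationals, checks that both groupings agree, and then measures how far each ordinary double-precision result lies from the exact rational sum. It assumes the gmp package is installed and that `as.bigq()` and `as.double()` on `bigq` values behave as that package documents; the variable names are illustrative only. Note that "exact" here means the exact sum of the doubles R actually stores, not of the decimal literals as typed.

```r
if (requireNamespace("gmp", quietly = TRUE)) {
  library(gmp)

  a <- 0.02
  b <- -0.000904686260609299
  d <- -0.005

  # Rational arithmetic is associative, so the grouping no longer matters
  qa <- as.bigq(a)
  qb <- as.bigq(b)
  qd <- as.bigq(d)
  exact_left  <- (qa + qb) + qd
  exact_right <- qa + (qb + qd)
  print(exact_left == exact_right)

  # Ordinary doubles: the two groupings from the question
  left  <- a + b + d
  right <- a + (b + d)

  # Distance of each double result from the exact rational sum
  err_left  <- abs(as.double(as.bigq(left)  - exact_left))
  err_right <- abs(as.double(as.bigq(right) - exact_left))
  print(c(left = err_left, right = err_right))
}
```

Comparing the two printed errors is one way to decide, for a given set of inputs, which grouping lands closer to the exact value.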

There's also an R package Rmpfr that lists system requirements:

SystemRequirements: gmp (>= 4.2.3), mpfr (>= 3.0.0)

I installed mpfr in Ubuntu using:

$ sudo apt-get install libmpfr-dev libmpfr-doc libmpfr4 libmpfr4-dbg

I didn't appear to need a system install of gmp but looking back at the console transcript I see it was done during the gmp installation and I didn't notice.
