I had an apparent bug in a script. After several hours, I discovered it was caused by floating-point rounding.

To make the problem reproducible, consider this:

0.02 - 0.000904686260609299 - 0.005   ==  
                                  0.02 + (-0.000904686260609299 - 0.005)
#[1] FALSE

Where:

print(0.02 -0.000904686260609299 -0.005, 22)
print(0.02 + (-0.000904686260609299 -0.005), 22)
#[1] 0.01409531373939069964774
#[1] 0.01409531373939070138246
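To put the gap in perspective, here is a minimal sketch in base R (the names x, y and gap are introduced only for illustration). It compares the difference between the two results with .Machine$double.eps, the relative spacing of doubles near 1:

```r
x <- 0.02 - 0.000904686260609299 - 0.005
y <- 0.02 + (-0.000904686260609299 - 0.005)

gap <- abs(x - y)            # absolute difference between the two results
print(gap)
print(gap / abs(x))          # relative difference
print(.Machine$double.eps)   # relative spacing of doubles near 1
```

The relative difference should come out at or below .Machine$double.eps, which suggests the two results are neighbouring doubles rather than substantially different values.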

Imagine a situation where you have long vectors a,b,c:

a -b -c  ==  a + (-b -c)

The difference might be statistically significant.

Can I increase the internal precision so that the test above returns TRUE?

If I have to choose, which result is the better approximation: a -b -c or a + (-b -c)?

IRTFM
antonio
  • `?"=="`: "For numerical and complex values, remember == and != do not allow for the finite representation of fractions, nor for rounding error. Using all.equal with identical is almost always preferable. See the examples." – A. Webb Nov 04 '18 at 18:12
  • You can use all.equal. `all.equal(0.02 -0.000904686260609299 -0.005, 0.02 + (-0.000904686260609299 -0.005))` results in TRUE. [See this post](https://stackoverflow.com/questions/9508518/why-are-these-numbers-not-equal/9508558#9508558) for more info on all.equal and comparisons of numbers. – phiver Nov 04 '18 at 18:15
  • Would one of the commenters please post their comment as an answer? (Answer to the last question: in general, adding/subtracting values of similar magnitudes is most accurate, so you should do those first; in the example above, probably `a-(b+c)`.) – Ben Bolker Nov 04 '18 at 18:21
  • Also note what `help("all.equal")` has to say: *Do not use all.equal directly in if expressions—either use isTRUE(all.equal(....))*. The *either* means it also suggests `identical`, if appropriate. – Rui Barradas Nov 04 '18 at 18:30
  • @BenBolker, I would almost say this is a sort of dupe of the post I linked to. At least the answer to this question is in one of the answers to that one. – phiver Nov 04 '18 at 18:42
  • @BenBolker: this suggests that, in a script, one should first test the number of digits and decide accordingly. – antonio Nov 04 '18 at 19:54
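The advice in the comments can be sketched for vectors as follows. The vectors a_vec, b_vec and c_vec and their values are made up for illustration; the elementwise test is an assumption about what a script might want, and it borrows the default tolerance of all.equal, sqrt(.Machine$double.eps).

```r
# Hypothetical input vectors; the first element repeats the example above
a_vec <- c(0.02, 0.1, 1.5)
b_vec <- c(0.000904686260609299, 0.2, 0.3)
c_vec <- c(0.005, 0.3, 0.7)

lhs <- a_vec - b_vec - c_vec
rhs <- a_vec + (-b_vec - c_vec)

# Exact comparison: can be FALSE wherever rounding differs
print(lhs == rhs)

# One TRUE/FALSE for the whole vector, safe to use inside if()
same_overall <- isTRUE(all.equal(lhs, rhs))
print(same_overall)

# Elementwise comparison with a relative tolerance
tol <- sqrt(.Machine$double.eps)
near <- abs(lhs - rhs) <= tol * pmax(abs(lhs), abs(rhs))
print(near)
```

Wrapping all.equal in isTRUE matters because, when the values differ, all.equal returns a character description rather than FALSE.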

1 Answer

To answer the first question (can the floating-point precision be increased?): the gmp package supports arbitrary-precision arithmetic.

install.packages("gmp", dependencies=TRUE)  # didn't see any dependencies
library(gmp)
z <- as.bigq(-0.000904686260609299)  # make smallest value a big-rational
0.02 + z - 0.005 == 0.02 + (z - 0.005)
#[1] TRUE
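The same package might also help with the second question (which ordering is closer to the true value). The following is only a sketch: it assumes that as.bigq converts a double to the exact rational it stores, so "exact" here means exact for the stored binary values, not for the decimal literals. The names cc, left, right, alt, exact, err_left and err_right are introduced for illustration.

```r
a  <- 0.02
b  <- 0.000904686260609299
cc <- 0.005                 # named cc to avoid masking base::c

left  <- a - b - cc         # evaluated left to right: (a - b) - cc
right <- a + (-b - cc)      # the two smaller terms are combined first
alt   <- a - (b + cc)       # form suggested in the comments

# Negation is exact in floating point, so alt and right coincide
print(alt == right)

if (requireNamespace("gmp", quietly = TRUE)) {
  library(gmp)
  # Rational arithmetic on the stored doubles, with no rounding
  exact <- as.bigq(a) - as.bigq(b) - as.bigq(cc)
  # Distance of each double-precision result from the exact value
  err_left  <- abs(as.numeric(as.bigq(left)  - exact))
  err_right <- abs(as.numeric(as.bigq(right) - exact))
  print(c(left = err_left, right = err_right))
}
```

Whichever error is smaller for these particular inputs, the result does not generalise; other values of a, b and c can favour the other ordering.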

There's also an R package Rmpfr that lists system requirements:

SystemRequirements: gmp (>= 4.2.3), mpfr (>= 3.0.0)

I installed mpfr in Ubuntu using:

$ sudo apt-get install libmpfr-dev libmpfr-doc libmpfr4 libmpfr4-dbg

I didn't appear to need a system install of gmp, but looking back at the console transcript, I see that it was done during the gmp installation and I didn't notice.

Ben Bolker
IRTFM