
Disclaimer

I was not sure whether to post this here or on CV, but after reading what is on topic on CV I think it is more R-specific than purely statistical. Thus, I posted it here.

Problem

Citing from ?.Machine

double.eps
the smallest positive floating-point number x such that 1 + x != 1. It equals double.base ^ ulp.digits if either double.base is 2 or double.rounding is 0; otherwise, it is (double.base ^ double.ulp.digits) / 2. Normally 2.220446e-16.

Thus, I would assume that all.equal(1 + .Machine$double.eps, 1.0) returns FALSE, which it does not. Reading the docs of all.equal I see that the default tolerance is .Machine$double.eps ^ 0.5.
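To make this concrete, here is a minimal check of both statements; the commented results are what I get on a standard IEEE-754 (binary64) setup:

(1 + .Machine$double.eps) == 1                   # FALSE, as the doc suggests
isTRUE(all.equal(1 + .Machine$double.eps, 1.0))  # TRUE, default tolerance is sqrt(.Machine$double.eps), about 1.5e-8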

Fair enough, but I observe some odd results which I do not understand:

isTRUE(all.equal(1.0 + .Machine$double.eps, 1.0, tolerance = .Machine$double.eps)) # TRUE
isTRUE(all.equal(1.0 - .Machine$double.eps, 1.0, tolerance = .Machine$double.eps)) # FALSE
isTRUE(all.equal(0.9 + .Machine$double.eps, 0.9, tolerance = .Machine$double.eps)) # FALSE
isTRUE(all.equal(2.0 + .Machine$double.eps, 2.0, tolerance = .Machine$double.eps)) # TRUE

Thus, all.equal seems to pick up the difference correctly only for numbers below 1.

The last explanation I could think of is that all.equal works on a relative difference scale by default, so I tried to override this behaviour, with no success either:

isTRUE(all.equal(1.0 + .Machine$double.eps, 1.0, 
                 tolerance = .Machine$double.eps, scale = 1)) # TRUE
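For completeness, here is a quick sketch of the absolute differences the four comparisons above actually operate on (output as I see it; note that it is the stored values, not the decimal literals, that matter):

(1.0 + .Machine$double.eps) - 1.0  # 2.220446e-16
1.0 - (1.0 - .Machine$double.eps)  # 2.220446e-16
(0.9 + .Machine$double.eps) - 0.9  # 2.220446e-16
(2.0 + .Machine$double.eps) - 2.0  # 0, the addition rounds back to 2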

Apparently, I have a massive misunderstanding of how floating-point numbers work in R, which leads me to these

Questions

  • How do I compare two numbers in R correctly with "maximum" precision (with respect to floating-point precision)?
  • Why are the results of all.equal different for numbers below and above 1?
  • [Bonus question]: What is the rationale for using .Machine$double.eps ^ 0.5 as the default tolerance instead of the un-square-rooted version? Is it simply to relax the test a bit?
thothal
  • Have a look at [What is the correct/standard way to check if difference is smaller than machine precision?](https://stackoverflow.com/q/59229545/10488504). – GKi Apr 22 '20 at 08:07
  • `.Machine$double.eps` is the smallest positive floating-point number `x` such that `1 + x != 1`. For `0.1 + x != 0.1`, the smallest such `x` is smaller than `.Machine$double.eps`, and for `100 + x != 100` it is larger than `.Machine$double.eps`. – GKi Apr 22 '20 at 08:11
  • To compare 2 numbers in R with "maximum" precision correctly, use `==`. – GKi Apr 22 '20 at 08:13
  • Wow, thanks, that makes a lot of sense now. But why is `all.equal(1.0 + .Machine$double.eps, 1.0, tolerance = .Machine$double.eps)` still `TRUE`? – thothal Apr 22 '20 at 08:15
  • Because you give a tolerance of `.Machine$double.eps` here: the difference is at most `.Machine$double.eps`, so it does not exceed the tolerance. – GKi Apr 22 '20 at 08:18
  • Ah, stupid me, of course it makes sense now. Would you mind summarising your comments in an answer so I can accept them? Thanks for the enlightenment! – thothal Apr 22 '20 at 08:24
  • “The smallest positive floating-point number `x` such that `1 + x != 1`” is an incorrect definition of `.Machine$double.eps`, even if the documentation says it. The correct definition is that it is the difference between 1 and the smallest representable value greater than 1, *i.e.*, the distance between representable values at the magnitude of 1. This is 2^−52 in IEEE-754 binary64. With the incorrect definition, `x` would be 2^−53 + 2^−105, because adding this `x` to `1` results in rounding up to the next representable value, 1 + 2^−52. – Eric Postpischil Apr 22 '20 at 11:02
  • There is **no** good general way to compare numbers that contain errors due to rounding: the rounding errors that may occur can range from zero to infinity (or NaN) depending on the algorithms and data involved, so there can be no default tolerance that is generally correct. Any tolerance reduces false negatives at the expense of allowing false positives, while making the tolerance too small allows false negatives, and each of these has different consequences for different applications. – Eric Postpischil Apr 22 '20 at 11:07
  • Re “[Bonus question]: What is the rationale for using .Machine$double.eps ^ 0.5 as the default tolerance instead of the un-square-rooted version? Is it simply to relax the test a bit?”: The only rationale is that a lot of people do simple things with floating-point, get only a few well-behaved rounding errors, and do not need to distinguish between numbers that would, if computed with real-number arithmetic, be very close but not equal. In essence, this default tolerance is suitable only for amateurs tinkering and breaks when complicated work is involved. Avoid using it. – Eric Postpischil Apr 22 '20 at 11:10

1 Answer


.Machine$double.eps is the difference between 1 and the smallest representable value greater than 1. The difference between 0.1 and the smallest representable value greater than 0.1 is smaller than .Machine$double.eps and the difference between 100 and the smallest representable value greater than 100 is larger than .Machine$double.eps. Have a look at: What is the correct/standard way to check if difference is smaller than machine precision?.

.Machine$double.eps is

.Machine$double.eps
[1] 2.220446e-16
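The spacing between representable doubles, however, depends on the magnitude of the number. Here is a small sketch; the ulp() helper below is not a base R function, just an illustration that is valid for normal, non-zero doubles:

ulp <- function(x) 2^(floor(log2(abs(x))) - 52)  # gap between x and the next representable double above it
ulp(1)    # 2.220446e-16, equal to .Machine$double.eps
ulp(0.1)  # 1.387779e-17, smaller than .Machine$double.eps
ulp(100)  # 1.421085e-14, larger than .Machine$double.eps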

When you do the calculations, the internally stored values will be approximately:

print(1.0 + .Machine$double.eps, digits = 20)
#[1] 1.000000000000000222
print(1.0 - .Machine$double.eps, digits = 20)
#[1] 0.99999999999999977796
print(0.9 + .Machine$double.eps, digits = 20)
#[1] 0.90000000000000024425
print(2.0 + .Machine$double.eps, digits = 20)
#[1] 2

Using tolerance = .Machine$double.eps, all.equal returns TRUE or FALSE depending on whether the difference of the internally stored values (by default measured relative to the target) is larger than the tolerance or not.

To compare two numbers in R for exact equality of their internally stored values, use ==.
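For example, for the cases from the question:

(1.0 + .Machine$double.eps) == 1.0  # FALSE, the stored values differ
(1.0 - .Machine$double.eps) == 1.0  # FALSE
(0.9 + .Machine$double.eps) == 0.9  # FALSE
(2.0 + .Machine$double.eps) == 2.0  # TRUE, the sum is rounded back to 2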

GKi
  • Your last sentence doesn't make sense. What is '"maximum" precision'? You should never compare calculated floating point numbers with ==. – Roland Apr 22 '20 at 09:39
  • I changed it to refer to the internally stored values. – GKi Apr 22 '20 at 09:50