I am reading through The R Inferno, and have run into something I do not understand. In addition to section 8.2.23 in the Inferno, there have been some good questions on comparing floating point numbers: question1, question2.
However, I am still running into a problem using all.equal. Using all.equal with its defaults, I get (mostly) the results I would expect:
> all.equal(2,1.99999997)
[1] "Mean relative difference: 1.5e-08"
> all.equal(2,1.99999998) #I expected FALSE here
[1] TRUE
> all.equal(2,1.99999999)
[1] TRUE
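For context, the default tolerance is sqrt(.Machine$double.eps), roughly 1.5e-8. Assuming the comparison is of the relative difference scaled by the target value 2 (my reading, not stated above), the three cases straddle that threshold:

```r
tol <- sqrt(.Machine$double.eps)  # default tolerance, ~1.490116e-08

abs(2 - 1.99999997) / 2  # ~1.5e-08, just above tol -> difference reported
abs(2 - 1.99999998) / 2  # ~1.0e-08, below tol      -> TRUE
abs(2 - 1.99999999) / 2  # ~5.0e-09, below tol      -> TRUE
```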
I am not sure why the function returns TRUE at 1.99999998, but that is less concerning than the following behavior, where I specify the tolerance:
> all.equal(2,1.98,tolerance=0.01) #Behaves as expected
[1] "Mean relative difference: 0.01"
> all.equal(2,1.981,tolerance=0.01) #Does not behave as expected
[1] TRUE
Furthermore,
> all.equal(2,1.980000000001,tolerance=0.01)
[1] TRUE
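One pattern I notice, assuming (and this is only an assumption on my part) that the difference is measured relative to the target value 2:

```r
tol <- 0.01

abs(2 - 1.98)           / 2  # ~0.01, not below tol    -> difference reported
abs(2 - 1.981)          / 2  # 0.0095, below tol       -> TRUE
abs(2 - 1.980000000001) / 2  # ~0.00999..., below tol  -> TRUE
```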
But if we compute:
> diff(c(1.981,2))
[1] 0.019
and clearly,
> diff(c(1.981,2)) >= 0.01
[1] TRUE
So, why is all.equal unable to distinguish 2 from 1.981 with a tolerance of 0.01?
EDIT
From the documentation: Numerical comparisons for scale = NULL (the default) are done by first computing the mean absolute difference of the two numerical vectors. If this is smaller than tolerance or not finite, absolute differences are used, otherwise relative differences scaled by the mean absolute difference.
Here I do not understand the behavior. I can see that the mean absolute difference, diff(c(1.981, 2)), is not smaller than the tolerance of 0.01:
> sprintf("%.25f",diff(c(1.981,2)))
[1] "0.0189999999999999058530875"
But then what does it get scaled by? When each vector has length one, the mean absolute difference equals the difference of the two numbers, so dividing the difference by the mean absolute difference would always give 1. Clearly, I am understanding the logic here wrong.
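For the record, here is my attempt to mimic the scale = NULL logic. The scaling step is an assumption on my part: the result only makes sense to me if the relative difference is scaled by the mean absolute target value (here 2), rather than by the mean absolute difference itself:

```r
target  <- 2
current <- 1.981
tol     <- 0.01

xy <- mean(abs(target - current))  # mean absolute difference, ~0.019
if (is.finite(xy) && xy >= tol) {
  # not smaller than tolerance: fall through to relative differences,
  # scaled (I assume) by the mean absolute target value
  xy <- xy / mean(abs(target))     # ~0.019 / 2 = ~0.0095
}
xy < tol  # TRUE -- consistent with all.equal(2, 1.981, tolerance = 0.01)
```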