
The deviations from the mean should always sum to 0. However, when the mean has many digits, possibly infinitely many as in this example, where it is 20/7, R does not return exactly 0.

x <- c(1,2,2,3,3,4,5)
sum(x - mean(x))

[1] -4.440892e-16

I am quite a newbie and have not found any information about this so far; maybe I was not searching for the right terms. Is it possible to calculate with infinitely long numbers in R? I am asking this out of theoretical interest.

uke
    Does this answer your question? [Is floating point math broken?](https://stackoverflow.com/questions/588004/is-floating-point-math-broken) – user438383 Nov 22 '22 at 09:31

2 Answers


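The problem you have described is not specific to R; it occurs in virtually every programming language. Internally, floating-point numbers are represented according to the IEEE 754 standard, which stores only a finite number of binary digits. You can read more about it here.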
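To make this concrete, here is a minimal base R sketch of why 20/7 cannot be stored exactly. The variable names are only illustrative, and the number of digits printed is an arbitrary choice.

```r
# 20/7 has no finite binary expansion, so R stores the nearest double instead.
m <- 20 / 7

# Print more digits than a double can carry (about 15 to 17 significant digits);
# the trailing digits are artefacts of the binary approximation.
print(sprintf("%.25f", m))

# Machine epsilon: the gap between 1 and the next larger double (2^-52).
eps <- .Machine$double.eps

# Doubles in [2, 4) are spaced 2 * eps = 2^-51 apart, about 4.44e-16,
# which is the same magnitude as the error shown in the question.
spacing_near_m <- 2 * eps
print(spacing_near_m)

# The classic decimal example behaves the same way.
print(0.1 + 0.2 == 0.3)   # FALSE
```

So the result in the question is off by about one representable step near 20/7, which is as close to 0 as double arithmetic can be expected to get here.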

As far as I know there is no easy way around these small errors, except for using number representations with higher precision.

EDIT: R already uses the double-precision representation of floating-point numbers. To read more about it, have a look at the R FAQ and this SO question.
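In practice, the usual way to live with these small errors in base R is to compare against a tolerance instead of testing for exact equality. A minimal sketch, where the threshold `tol` is an arbitrary value picked for illustration:

```r
x <- c(1, 2, 2, 3, 3, 4, 5)
s <- sum(x - mean(x))

# Exact comparison: FALSE whenever s carries a rounding error like the one in the question.
print(s == 0)

# all.equal() compares with a default tolerance of about 1.5e-8;
# wrap it in isTRUE() because it returns a character string on mismatch.
is_zero <- isTRUE(all.equal(s, 0))
print(is_zero)

# Alternatively, use an explicit threshold (assumed value, adjust to your data).
tol <- 1e-12
is_small <- abs(s) < tol
print(is_small)
```

This does not remove the rounding error; it only treats values that close to 0 as 0.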

Marcello Zago
  • If you deal with rational numbers only, as in your example, you can use the gmp package, which computes with exact fractions.

  • You can use the Rmpfr package to deal with numbers of arbitrary (but finite) precision, which you have to set yourself.

  • Another possibility is the lazyNumbers package, freshly released on CRAN:

library(lazyNumbers)

# create a vector of lazy numbers
x <- lazyvec(c(1, 2, 2, 3, 3, 4, 5))
# compute its mean
m <- sum(x) / length(x)
# sum expected to be 0
y <- sum(x - m)
# convert it to double
as.double(y)
## 0
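For comparison, the first two options from the list above could look roughly like this. It is only a sketch: it assumes the gmp and Rmpfr packages are installed, and the precision of 200 bits is an arbitrary choice.

```r
library(gmp)
library(Rmpfr)

x <- c(1, 2, 2, 3, 3, 4, 5)
n <- length(x)

# gmp: exact rational arithmetic with "bigq" numbers
x_q <- as.bigq(x)
m_q <- sum(x_q) / as.bigq(n)   # the exact fraction 20/7
s_q <- sum(x_q - m_q)          # exactly 0, no rounding involved
print(s_q == 0)

# Rmpfr: binary floating point with a user-chosen precision (200 bits here)
x_mp <- mpfr(x, precBits = 200)
m_mp <- sum(x_mp) / n          # 20/7 rounded to 200 bits
s_mp <- sum(x_mp - m_mp)       # far smaller than with doubles, but still a rounded result
print(s_mp)
```

The difference is that gmp stores 20/7 as a fraction and therefore never rounds, while Rmpfr only pushes the rounding error further out, since 20/7 has no finite binary expansion at any precision.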
Stéphane Laurent