I know floating-point numbers are strange, but I haven't come across this exact issue before. I have a vector of numbers in R. I count how many are bigger than zero and take the mean of that to get the proportion above zero. I round the result and assign it to an object. When I then paste it, the extra decimal places somehow come back. The vector is too long to dput, but here are the head and str:

> head(x)
[1] 0.1616631 0.2117250 0.1782197 0.1791657 0.2067048 0.2042075
> str(x)
 num [1:4000] 0.162 0.212 0.178 0.179 0.207 ...

Now here's where I run into issues:

> y <- round(mean(x > 0) * 100, 1)

> y
[1] 99.7

> str(y)
 num 99.7

> paste(100 - y, "is the inverse")
[1] "0.299999999999997 is the inverse"

But it doesn't behave the same if I don't subtract from 100:

> paste(y, "is it pasted")
[1] "99.7 is it pasted"

I know I could put round right into the paste command or use sprintf, and I know how floats are represented in R, but I'm specifically wondering why it occurs for the former situation and not the latter? I cannot get a reproducible example, either, because I cannot get a randomly-generated vector to behave in the same way.

Mark White
  • Possible duplicate of [Why are these numbers not equal?](https://stackoverflow.com/questions/9508518/why-are-these-numbers-not-equal) – dan04 Dec 13 '18 at 20:21
  • @dan04 Thanks! I've read that thread before, but it seems to me like `paste` should treat both `y` and `100 - y` the same then. To be more specific, why does it print the decimals for `100 - y`, but not `y`? – Mark White Dec 13 '18 at 20:23
  • It appears that R's default floating-point stringification uses the minimum number of decimal places that can recreate the original IEEE 754 `double` value. A literal like `99.7` (which is actually 99.7000000000000028421709430404007434844970703125) will round-trip as-is. OTOH, `100 - 99.7` evaluates to 0.2999999999999971578290569595992565155029296875, which is not the same as `0.3` (actually 0.299999999999999988897769753748434595763683319091796875). – dan04 Dec 13 '18 at 20:30

1 Answer

There's rounding error, but in this case R is not handling it nicely.

R stores every floating-point number as a double, which gives 53 bits of precision, or roughly 16 significant decimal digits. That applies to 99.7 as well, and you can see where it breaks down:

print(99.7, digits=16) # works fine
print(99.7, digits=17) # Adds a 3 at the end on my platform

That limit is always there, and the documentation for print warns about it where it describes the digits argument.

But when you do calculations, the rounding error stays the same in absolute terms. Your expected value of 0.3 carries an absolute error just as large as the one in 99.7, but relative to 0.3 it is roughly 300 times larger. It therefore "fails" at fewer significant digits:

print(100-99.7, digits=14) # works fine
print(100-99.7, digits=15) # Already shows rounding error at digits=15
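
As a rough sketch of that reasoning (the variable names are purely illustrative, and treating the gap to 0.3 as "the" error is a simplification), the same absolute error can be compared against both magnitudes:

```r
y <- 99.7
d <- 100 - y

# Absolute gap between the computed result and the double nearest to 0.3
abs_err <- abs(0.3 - d)

# The same absolute error, measured relative to each magnitude
rel_to_d <- abs_err / 0.3
rel_to_y <- abs_err / 99.7

# How much larger the error is relative to 0.3 than relative to 99.7
ratio <- rel_to_d / rel_to_y

# format() follows the same digits logic as print()
d14 <- format(d, digits = 14)
d15 <- format(d, digits = 15)

print(abs_err)
print(ratio)
print(c(d14, d15))
```

The ratio is simply 99.7 / 0.3, a little over 300, which is why the result of the subtraction runs out of clean digits earlier than 99.7 itself does.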

Now, paste passes every number to as.character, which (unfortunately, in this case) ignores any options you have set and always uses 15 significant digits.
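
A small sketch to check this on your own setup, assuming the behaviour described above holds on your R version (`old` is just a throwaway name for the saved options):

```r
y <- 99.7

# paste() converts its arguments with as.character()
s_y <- as.character(y)
s_d <- as.character(100 - y)
same_as_paste <- identical(paste(100 - y), s_d)

# Changing options(digits) affects print(), but not paste()/as.character()
old <- options(digits = 4)
pasted_with_option <- paste(100 - y)
print(100 - y)
options(old)  # restore the previous setting

print(c(s_y, s_d, pasted_with_option))
print(same_as_paste)
```

Here print shows a short value once digits is lowered, while the pasted string keeps all 15 significant digits.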

To solve it, you can use format to specify the desired number of digits:

paste(format(100 - y, digits=14), "is the inverse")
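
For comparison, the two alternatives mentioned in the question (round inside paste, or sprintf) could look roughly like this. It is only a sketch: it hard-codes y as 99.7 and assumes one decimal place is what you want.

```r
y <- 99.7

# Round the difference again before it reaches as.character()
via_round <- paste(round(100 - y, 1), "is the inverse")

# Or let sprintf() control the number of decimals directly
via_sprintf <- sprintf("%.1f is the inverse", 100 - y)

print(via_round)
print(via_sprintf)
```

Note that sprintf with "%.1f" always prints exactly one decimal place, whereas round followed by paste drops trailing zeros.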
Emil Bode