Inconsistent output due to reordering of equation and floating point numbers

Question

c = .1
e = .3


(200 - 100) / (200 - 100) * (e - c) + c == .3

output:

[1] TRUE

But:

(e - c) * (200 - 100) / (200 - 100) + c == .3

output:

[1] FALSE

Why does reordering change the output?

This problem has nothing to do with algebraic order of operations, as both left hand expressions will give the same result theoretically.

I suspect it might be compilation optimization that is causing the differences in results.

In the first equation, the two c terms will just cancel out without actually doing the arithmetic inside the (e - c) part.

Whereas in the second equation the arithmetic inside (e - c) has to be computed, thus the computer has to compute .3 - .1, which leads to an imprecision error.

@markus: The OP's problem does not appear to be related to floating point arithmetic. It is, as RLave writes, related to order of operations/operator precedence. — AkselA, Dec 20 '18 at 13:41
"I suspect it might be compilation optimization that is causing the differences in results." I doubt that. Order of operations matters in floating point arithmetic even if equivalent mathematically. — Roland, Dec 20 '18 at 13:59
To be clear, I think you need to discuss both floating-point arithmetic and order of operations to explain this behaviour, and I can find no suitable dupe target. — AkselA, Dec 20 '18 at 14:25
@AkselA I voted to reopen so that all aspects can be considered. Best — markus, Dec 20 '18 at 14:32
The problem is that `e` and `c` are most likely not exactly equal to `0.3` and `0.1` respectively, which becomes apparent if you display enough decimals. — ira, Dec 20 '18 at 16:04
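
To see what the last comment describes, one can print the values with more decimals than R shows by default. This is a minimal sketch, assuming IEEE 754 double precision (the usual case on current platforms); the names diff_is_point2 and diff_is_close are made up for illustration.

```r
c <- 0.1
e <- 0.3

# Show 20 decimal places instead of R's default 7 significant digits.
sprintf("%.20f", c(c, e, e - c, 0.2))

# The stored difference is not the same double as the literal 0.2 ...
diff_is_point2 <- (e - c) == 0.2
diff_is_point2   # FALSE

# ... although the two agree to within all.equal()'s default tolerance.
diff_is_close <- isTRUE(all.equal(e - c, 0.2))
diff_is_close    # TRUE
```

So the inexactness is already present in e - c before any multiplication or division takes place.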

Ben Bolker · Answer 1 · 2018-12-20T18:32:47.500

Following @AkselA's example a little farther, removing the +c for simplicity:

# a, b, c and e as defined in @AkselA's answer
a <- 100
b <- 100
c <- 0.1
e <- 0.3
r1 <- a / b * (e - c)
r2 <- (e - c) * a / b
r3 <- (e - c) * (a / b)
options(digits=22)
r1
## [1] 0.1999999999999999833467
r2
## [1] 0.2000000000000000111022
r3
## [1] 0.1999999999999999833467

We could simplify this still farther to d <- e-c; a/b*d == d*a/b and get the same results.

The results depend on whether the division by b is done before multiplying by (e-c) (r1, r3) or after (r2). Since floating point arithmetic is commutative but is not associative (see Wikipedia, or any of the links in this answer), we can see that r1 and r3 should indeed be identical (a/b and e-c are evaluated, then multiplied), not necessarily the same as r2 (e-c is multiplied by a, then (e-c)*a is divided by b).

These differences are not about compiler optimization, as suggested by the OP (R is an interpreted language; it doesn't optimize the execution of arithmetic expressions). [Also, the cancellation suggested by the OP doesn't actually work ...]
They're not about integer vs. floating point operations either (str(a) or storage.mode(a) show that a is a floating-point number, not an integer; use 100L if you want an integer).
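
The two evaluation orders can be traced one step at a time to see where the rounding happens. This is only a sketch, assuming IEEE 754 doubles; the intermediate names d, step_mul and ratio are introduced here for illustration and are not part of the original answer.

```r
a <- 100; b <- 100
c <- 0.1; e <- 0.3

d <- e - c            # already slightly below 0.2

# r2 path: multiply first, then divide.
step_mul <- d * a     # the product is rounded to the nearest double
r2 <- step_mul / b    # the quotient is rounded again

# r1/r3 path: divide first, then multiply.
ratio <- a / b        # exactly 1, so no rounding error is introduced
r1 <- ratio * d       # multiplying by 1 returns d unchanged

sprintf("%.20f", c(d, step_mul, r2, r1))

r1 == d               # TRUE: nothing changed d on this path
r2 == d               # FALSE: the two extra roundings moved the value

storage.mode(a)       # "double"
storage.mode(100L)    # "integer"
```

On this path d * a rounds to exactly 20, and 20 / 100 then rounds to the double nearest 0.2, which is not the same double as e - c.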

Floating point arithmetic is not associative. Floating point arithmetic is not associative. Floating point arithmetic is not associative. Floating point arithmetic is not associative. Floating point arithmetic is not associative. ... — AkselA, Dec 20 '18 at 18:38

AkselA · Answer 2 · 2018-12-20T18:36:30.300

I honestly do not know what's going on. Of course the float thing is obvious, but I fail to see why reordering the equation like this should change things. I've experimented using a slightly simplified version of the equation, changing values and moving things around.

eq.1 and eq.2 are the same as the OP's, eq.3 is equivalent to eq.1 thanks to operator precedence (see ?Syntax). * and / are evaluated before + and -, other than that they are evaluated from left to right, and parentheses are resolved from inner to outer. So in eq.1 and eq.3 the order is - / * +, in eq.2 the order is - * / +.

a <- 100
b <- 100
c <- 0.1
e <- 0.3

r1 <- a / b * (e - c) + c    # [1]

r2 <- (e - c) * a / b + c    # [2]

r3 <- (e - c) * (a / b) + c  # [3]

r1 == r2  # FALSE

r1 == r3  # TRUE

sprintf("%.20f", c(r1, r2, r3))
# "0.29999999999999998890" "0.30000000000000004441" "0.29999999999999998890"
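
The evaluation order described above can be checked without computing anything, by looking at how R parses the expressions. A small sketch using quote(), with the + c dropped for brevity; top1, top2 and lhs2 are illustrative names.

```r
# Outermost call of each expression, i.e. the operation evaluated last.
top1 <- as.character(quote(a / b * (e - c))[[1]])
top2 <- as.character(quote((e - c) * a / b)[[1]])
top1   # "*": the division a / b happens before the multiplication
top2   # "/": the multiplication (e - c) * a happens before the division

# In eq.2 the left operand of the final division is the product (e - c) * a.
lhs2 <- quote((e - c) * a / b)[[2]]
identical(lhs2, quote((e - c) * a))   # TRUE
```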

It's clear that the exact value depends on the order of operations, although in purely arithmetic terms it shouldn't matter (in exact arithmetic the operations are associative). But the ordering only makes a difference when you're using some specific values. If you set a and b to 10, say, or set c to 0.2, the values are identical. My hunch is that this is caused by division being done in integer mode in eq.1, and float mode in eq.2, the second introducing rounding errors the first one doesn't (contingent on starting values). The real take-home message is of course what markus linked to: you need to take extra care when comparing floats. But it would still be nice to have a definite explanation of this precise behaviour.
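
As a sketch of what "extra care when comparing floats" can look like in practice, base R's all.equal() compares within a tolerance instead of bit for bit. The explicit tolerance of 1e-9 in the last line is an arbitrary choice for illustration.

```r
a <- 100; b <- 100
c <- 0.1; e <- 0.3

r1 <- a / b * (e - c) + c
r2 <- (e - c) * a / b + c

r1 == r2                      # FALSE: exact comparison, as above
isTRUE(all.equal(r1, r2))     # TRUE: equal within the default tolerance

# A hand-rolled check with an explicit tolerance.
abs(r1 - r2) < 1e-9           # TRUE
```

Wrapping all.equal() in isTRUE() matters, because all.equal() returns a character description rather than FALSE when the values differ.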

No, but I wanted to have the four main operators represented, and it's relevant to the OP's comment on it being "canceled out" (it isn't). — AkselA, Dec 20 '18 at 18:12
