c = .1
e = .3


(200 - 100) / (200 - 100) * (e - c) + c == .3

output:

[1] TRUE

But:

(e - c) * (200 - 100) / (200 - 100) + c == .3

output:

[1] FALSE

Why does reordering change the output?

This has nothing to do with the algebraic order of operations, since both left-hand sides give the same result in theory.

I suspect a compiler optimization is causing the difference in results.

In the first expression, the two occurrences of c might simply cancel out, without the arithmetic inside (e - c) actually being carried out.

In the second expression, (e - c) has to be evaluated, so the computer has to compute .3 - .1, which introduces a rounding error.

Ben Bolker
  • @markus: The OP's problem does not appear to be related to floating point arithmetic. It is, as RLave writes, related to order of operations/operator precedence. – AkselA Dec 20 '18 at 13:41
  • "I suspect it might be compilation optimization that is causing the differences in results." I doubt that. Order of operations matters in floating point arithmetic even if equivalent mathematically. – Roland Dec 20 '18 at 13:59
  • To be clear, I think you need to discuss both floating-point arithmetic and order of operations to explain this behaviour, and I can find no suitable dupe target. – AkselA Dec 20 '18 at 14:25
  • @AkselA I voted to reopen so that all aspects can be considered. Best – markus Dec 20 '18 at 14:32
  • The problem is that `e` and `c` are most likely not exactly equal to `0.3` and `0.1` respectively, which becomes apparent if you display enough decimals. – ira Dec 20 '18 at 16:04
  • Just for fun, you can also do: `.1 + .2 == .3` – ira Dec 20 '18 at 16:14
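
The point made in the last two comments can be checked directly. The following is a minimal sketch, using only base R and the values from the question, that prints the stored values with more decimals than the default display shows:

```r
c <- 0.1
e <- 0.3

# Default printing rounds to 7 significant digits, which hides the error
print(e - c)

# 20 decimals expose the binary approximations that are actually stored
sprintf("%.20f", c)
sprintf("%.20f", e)
sprintf("%.20f", e - c)

# The subtraction does not land on the double closest to 0.2
(e - c) == 0.2   # FALSE

# The classic check from the comment above
.1 + .2 == .3    # FALSE
```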

2 Answers


Following @AkselA's example a little further, and removing the `+ c` for simplicity:

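# values as defined in @AkselA's answer
a <- 100
b <- 100
c <- 0.1
e <- 0.3
r1 <- a / b * (e - c)    # divide first, then multiply
r2 <- (e - c) * a / b    # multiply first, then divide
r3 <- (e - c) * (a / b)  # divide first, then multiply
options(digits=22)
r1
## [1] 0.1999999999999999833467
r2
## [1] 0.2000000000000000111022
r3
## [1] 0.1999999999999999833467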

We could simplify this still further to `d <- e-c; a/b*d == d*a/b` and get the same results.

The results depend on whether the division by `b` is done before multiplying by `(e-c)` (`r1`, `r3`) or after (`r2`). Floating-point arithmetic is commutative but not associative (see Wikipedia, or any of the links in this answer). So `r1` and `r3` should indeed be identical (`a/b` and `e-c` are evaluated, then multiplied), but they are not necessarily the same as `r2` (`e-c` is multiplied by `a`, then `(e-c)*a` is divided by `b`).
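
To see where the two evaluation orders diverge, it may help to store every intermediate result. This is only an illustrative sketch with the same values as above; the names `d`, `q`, `p`, `left` and `right` are introduced here and are not part of the original code:

```r
a <- 100
b <- 100
d <- 0.3 - 0.1     # already slightly below 0.2 because of rounding

# r1/r3 path: divide first. a / b is exactly 1, so d passes through unchanged.
q    <- a / b
left <- q * d

# r2 path: multiply first. d * a is rounded to the nearest double, then divided.
p     <- d * a
right <- p / b

sprintf("%.20f", c(d, left, right))

q == 1          # TRUE
left == d       # TRUE
p == 20         # TRUE: the rounding in d * a lands exactly on 20
right == 0.2    # TRUE: 20 / 100 gives the double closest to 0.2
left == right   # FALSE
```

In other words, with these particular values the multiplication by `a` happens to round away the small error carried by `e - c`, so the later division yields the double closest to 0.2 rather than the slightly smaller value of `e - c`.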

  • these differences are not about compiler optimization as suggested by the OP (R is an interpreted language; it doesn't optimize the execution of arithmetic expressions) [also, the cancellation suggested by the OP doesn't actually work ...]
  • they're not about integer vs. floating-point operations (`str(a)` or `storage.mode(a)` shows that `a` is a floating-point number, not an integer; use `100L` if you want an integer)
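
The storage-mode claim in the second bullet can be verified with a few base R calls. A short sketch, assuming `a` is defined as in the other answer:

```r
a <- 100

str(a)                  # num 100
typeof(a)               # "double"
storage.mode(a)         # "double"

typeof(100L)            # "integer": the L suffix requests an integer literal
typeof(100L / 100L)     # "double": `/` returns a double even for integer inputs
typeof(100L %/% 100L)   # "integer": integer division keeps the integer type
```

So even with integer inputs, `/` would not perform the division in integer mode.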
Ben Bolker
  • Floating point arithmetic is not associative. Floating point arithmetic is not associative. Floating point arithmetic is not associative. Floating point arithmetic is not associative. Floating point arithmetic is not associative. ... – AkselA Dec 20 '18 at 18:38

I honestly do not know what's going on. Of course the float thing is obvious, but I fail to see why reordering the equation like this should change things. I've experimented using a slightly simplified version of the equation, changing values and moving things around.

eq.1 and eq.2 are the same as the OP's; eq.3 is equivalent to eq.1 thanks to operator precedence (see `?Syntax`). `*` and `/` are evaluated before `+` and `-`; other than that, operators are evaluated from left to right, and parentheses are resolved from inner to outer. So in eq.1 and eq.3 the order is `-` `/` `*` `+`, while in eq.2 the order is `-` `*` `/` `+`.

a <- 100
b <- 100
c <- 0.1
e <- 0.3

r1 <- a / b * (e - c) + c    # [1]

r2 <- (e - c) * a / b + c    # [2]

r3 <- (e - c) * (a / b) + c  # [3]

r1 == r2  # FALSE

r1 == r3  # TRUE

sprintf("%.20f", c(r1, r2, r3))
# "0.29999999999999998890" "0.30000000000000004441" "0.29999999999999998890"
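
The left-to-right grouping described above can be inspected without evaluating anything. This is a small sketch using base R's `quote()`; `e1` and `e2` are names made up for the illustration, and `d` stands in for `(e - c)`:

```r
# quote() captures an expression unevaluated;
# element [[1]] of the resulting call is its outermost operator
e1 <- quote(a / b * d)   # parsed as (a / b) * d
e2 <- quote(d * a / b)   # parsed as (d * a) / b

as.character(e1[[1]])    # "*"
e1[[2]]                  # a / b
as.character(e2[[1]])    # "/"
e2[[2]]                  # d * a
```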

It's clear that the exact value depends on the order of operations, although in purely arithmetic terms it shouldn't matter (multiplication and division of real numbers can be regrouped freely). But the ordering only makes a difference for some specific values. If you set `a` and `b` to 10, say, or set `c` to 0.2, the values are identical. My hunch is that this is caused by the division being done in integer mode in eq.1 and in float mode in eq.2, the second introducing rounding errors that the first doesn't (contingent on starting values). The real take-home message is of course what markus linked to: you need to take extra care when comparing floats. It would still be nice to have a definite explanation of this precise behaviour.
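
As a sketch of what taking extra care can look like in practice, base R's `all.equal()` compares within a tolerance instead of bit for bit. The explicit threshold `1e-9` in the last line is an arbitrary choice for this example, not a recommended value:

```r
a <- 100
b <- 100
c <- 0.1
e <- 0.3

r1 <- a / b * (e - c) + c
r2 <- (e - c) * a / b + c

r1 == r2                    # FALSE: exact comparison
isTRUE(all.equal(r1, r2))   # TRUE: equal within the default tolerance (about 1.5e-8)
abs(r1 - r2) < 1e-9         # TRUE: manual comparison with an explicit tolerance
```

`all.equal()` returns a character description rather than `FALSE` when the values differ, which is why it is wrapped in `isTRUE()` here.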

AkselA