
Consider the following:

probs <- c(0.05, rep(0.95/99, 99))

which clearly sums to 1 according to

sum(probs)

However, when I type

sum(probs == 1)

I get 0 (i.e., boolean FALSE).

Why does this discrepancy occur? Shouldn't the two commands be equal?

As a test, I compared these with all.equal():

all.equal(sum(probs), sum(probs == 1))
[1] "Mean relative difference: 1"

all.equal(sum(probs == 1), sum(probs))
[1] "Mean absolute difference: 1"

which seems to suggest inequality, but why?

My guess is that R's numerical precision handling (i.e., machine epsilon) is not stringent enough.

Any thoughts?

compbiostats
  • You're comparing 0 and 1 in your `all.equal` comparisons. `sum(probs)` is `1`, while `sum(probs==1)` is `0`. `all.equal(sum(probs),1)` on the other hand is `TRUE` – thelatemail Sep 21 '18 at 01:47
  • Did you mean `sum(probs) == 1`? Read this: [Why are these numbers not equal?](https://stackoverflow.com/questions/9508518/why-are-these-numbers-not-equal) – Ronak Shah Sep 21 '18 at 02:03
  • @thelatemail The reason I ask is that I am performing a simulation where 100 objects are selected according to the distribution given by probs. Would there be an easy way to ameliorate the 'discrepancy' (I am not able to run the simulation otherwise), perhaps by increasing the machine epsilon in R? – compbiostats Sep 21 '18 at 02:04
  • @JarrettPhillips - my point is you're not even comparing the right things. Why would you expect 0 and 1 to be equal given any standard `.Machine$double.eps` ? `sum(probs)` is indeed `1`. `sum(probs==1)` is `0` because `probs==1` returns 100 values of `FALSE`. Summing 100 `FALSE` / `0` values is still `0`. Notice the slight difference in the placement of the `()` in the code suggested by Ronak. – thelatemail Sep 21 '18 at 03:07
  • @thelatemail I just noticed the change Ronak made to their earlier comment. Anyway, I see the point now - thanks. – compbiostats Sep 21 '18 at 13:22
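
The point made in the comments above can be checked directly. The following is a minimal sketch using the `probs` vector from the question. The variable names (`elementwise`, `total`, `approx_one`, `draws`) are made up for illustration, and the `sample()` call at the end is only a guess at how such a simulation might draw its objects, not the asker's actual code.

```r
probs <- c(0.05, rep(0.95 / 99, 99))

# probs == 1 is evaluated elementwise and returns 100 logical values.
# No single probability equals 1, so every entry is FALSE.
elementwise <- probs == 1
length(elementwise)
sum(elementwise)        # counts the TRUE entries

# sum(probs) == 1 compares the total to 1 instead (note where the
# closing parenthesis sits). Exact equality on doubles can be fragile,
# so a tolerance-based check is usually the safer test.
total <- sum(probs)
approx_one <- isTRUE(all.equal(total, 1))
approx_one

# Assumption: the simulation draws with base R's sample(). Its `prob`
# argument takes weights that need not sum to exactly one.
set.seed(1)
draws <- sample(seq_along(probs), size = 100, replace = TRUE, prob = probs)
length(draws)
```

Wrapping `all.equal()` in `isTRUE()` matters because `all.equal()` returns a character string describing the difference, rather than `FALSE`, when the values differ.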

0 Answers