1

The a and b as shown below are the same quantities but are calculated in two different ways in R. They are mostly the same but with several big differences. I could not figure out why that was the case.

theta0 <- c(-0.4, 10)

OS.mean <- function(shape, rank, n=100){
  term1 <- factorial(n)/(factorial(rank-1)*factorial(n-rank))
  term2 <- beta(n-rank+1, rank) - beta(n-rank+shape+1, rank)
  term1*term2/shape
}

OS.mean.theta0.100 <- OS.mean(theta0[1], rank=seq(1, 100, by=1))

Bias.MOP <- function(shape, scale, alpha){
  scale*shape*OS.mean.theta0.100[alpha*100]/(1-(1-alpha)^shape) - scale
}

a <- rep(0, 98)
for(i in 2:99){
  a[i-1] <- Bias.MOP(theta0[1], theta0[2], i/100)
}
plot(a)

b <- Bias.MOP(theta0[1], theta0[2], seq(0.02, 0.99, by=0.01))
plot(b)

a-b

One other strange thing is as follows.

b[13] # -0.8185083
Bias.MOP(theta0[1], theta0[2], 0.14) # -0.03333929

They are supposed to be the same. But they clearly are not. Why?

LaTeXFan
  • 1,136
  • 4
  • 14
  • 36
  • 1
    Check `((2:99)/100) - seq(0.02, 0.99, by=0.01)`. –  Jul 29 '15 at 01:52
  • @Pascal They are quite close with difference e-17. But you will see some entries of `a-b` are 0.3. Is that normal? By the way, which method is better? – LaTeXFan Jul 29 '15 at 01:56
  • @BondedDust Which section? – LaTeXFan Jul 29 '15 at 01:57
  • 1
    Miscellaneous: why aren't these numbers equal – IRTFM Jul 29 '15 at 01:59
  • 1
    OP is not commenting on the small differences of order e-16 etc. they are commenting on those few points that are different on order e-1: indices 13, 28, 56, 57. – mathematical.coffee Jul 29 '15 at 02:00
  • @mathematical.coffee Yes, exactly. Do you know why, please? Moreover, why are `b[13]` and `Bias.MOP(theta0[1], theta0[2], 0.14)` so different? – LaTeXFan Jul 29 '15 at 02:02
  • @mathematical.coffee Change `alpha*100` to `round(alpha * 100)` and it solves the problem. Answer forthcoming – David Robinson Jul 29 '15 at 02:05
  • @DavidRobinson Yes, it did. But what went wrong, please? In particular, why are `b[13]` and `Bias.MOP(theta0[1], theta0[2], 0.14)` so different, please? They are indeed the same procedure. – LaTeXFan Jul 29 '15 at 02:08
  • Odd - `seq(.02, .99, by=.01)[13]*100) - 14` is on the order `e-15`. But `as.integer(seq(...)) - 14` is `-1`. Your index `alpha*100` is evaluating to 14 for `.14/100`, but to 13 for the `seq(...)`. – mathematical.coffee Jul 29 '15 at 02:09
  • @DavidRobinson By the way, what will happen if I write `x[1.01]` where `x` is a vector, but `1.01` is not an integer? – LaTeXFan Jul 29 '15 at 02:10
  • 1
    @20824 you know, you can try all these things out yourself.. – rawr Jul 29 '15 at 02:10
  • Just to be clear... This was yet another instance of not understanding the faq about numeric values. With the additional bit of knowledge that indices are truncated down. – IRTFM Aug 05 '15 at 01:02

1 Answers1

5

The problem is you are indexing by a numeric, alpha*100 in this line:

OS.mean.theta0.100[alpha*100]

When floating point error causes seq(0.02, 0.99, by=0.01) to be even slightly less than the corresponding integer in 2:99, you end up extracting the wrong number from theta0.100. For example, see:

x <- 1:10
x[5]
# [1] 5
x[6]
# [1] 6
x[5.99999999]
# [1] 5

A quick solution is to change alpha*100 to round(alpha*100), as below, to ensure that you're always selecting by the nearest round number.

Bias.MOP <- function(shape, scale, alpha){
  scale*shape*OS.mean.theta0.100[round(alpha*100)]/(1-(1-alpha)^shape) - scale
}
Community
  • 1
  • 1
David Robinson
  • 77,383
  • 16
  • 167
  • 187
  • Thanks. But still without modifying my function, why are `b[13]` and `Bias.MOP(theta0[1], theta0[2], 0.14)` so different, please? They are supposed to be correct or wrong together. Right? – LaTeXFan Jul 29 '15 at 02:14
  • @20824 Because of [floating point error](http://stackoverflow.com/questions/6874867/floating-point-issue-in-r). Try `.14 - seq(0.02, 0.99, by=0.01)[13]`. You would expect it to be 0, but it is not. Thus, `OS.mean.theta0.100[alpha*100]` will be chosen differently between those two alphas. – David Robinson Jul 29 '15 at 02:16
  • Yes. I see. Thanks again. Does other software deal with this kind of things better than R. Such as Python or something else? – LaTeXFan Jul 29 '15 at 02:18
  • 1
    @20824 *No.* Floating point error is [universal](http://floating-point-gui.de/) [across](http://stackoverflow.com/questions/21895756/why-are-floating-point-numbers-inaccurate) [languages](http://csharpindepth.com/Articles/General/FloatingPoint.aspx). One of the ways to handle it (besides being aware of it) is not to use the result of floating point calculations directly in indexing, or in equalities – David Robinson Jul 29 '15 at 02:22
  • 1
    @20824 For example, you could have organized your function as `scale*shape*OS.mean.theta0.100[alpha]/(1-(1-alpha / 100)^shape) - scale` instead, then pass it `2:99`. That would work perfectly. – David Robinson Jul 29 '15 at 02:23