2

I’m using the acceptance-rejection method for beta distribution with g(x) = 1, 0 ≤ x ≤ 1. The function is: f(x) = 100x^3(1-x)^2.

I want to create an algorithm to generate data from this density function.

How can I estimate P(0 ≤ X ≤ 0.8) with k = 1000 repetitions (n=1000)? How can I solve this in R?

I already have:

beta.rejection <- function(f, c, g, n) {
naccepts <- 0
result.sample <- rep(NA, n)

  while (naccepts < n) {
    y <- runif(1, 0, 0.8)
    u <- runif(1, 0, 0.8)

    if ( u <= f(y) / (c*g(y)) ) {
      naccepts <- naccepts + 1
      result.sample[n.accepts] = y
    }
  }

  result.sample
}

f <- function(x) 100*(x^3)*(1-x)^2
g <- function(x) x/x
c <- 2

result <- beta.rejection(f, c, g, 10000)

for (i in 1:1000){ # k = 1000 reps
  result[i] <- result[i] / n
}

print(mean(result))
Mark
  • 23
  • 1
  • 4

1 Answers1

2

You are close, but a few problems:

1) typo with naccepts vs. n.accepts

2) If you aren't going to hard-wire in g then you shouldn't hard-wire in runif() as the function which generates random variables which are distributed according to g. The function rejection (why hard-wire in beta?) should also be passed a function which is capable of generating the appropriate random variables.

3) u should be drawn from [0,1] not [0,0.8]. The 0.8 should have no part in the generation of the values, just their interpretation.

4) c need to be an upper bound for f(y)/g(y). 2 is too small. Why not take the derivative of f to find it's max? 3.5 works. Also -- c isn't a good name for a variable in R (because of the function c()). Why not call it M?

Making these changes yields:

rejection <- function(f, M, g, rg,n) {
  naccepts <- 0
  result.sample <- rep(NA, n)

  while (naccepts < n) {
    y <- rg(1)
    u <- runif(1)

    if ( u <= f(y) / (M*g(y)) ) {
      naccepts <- naccepts + 1
      result.sample[naccepts] = y
    }
  }

  result.sample
}

f <- function(x) 100*(x^3)*(1-x)^2
g <- function(x) 1
rg <- runif
M <- 3.5 #upper bound on f (via basic calculus)

result <- rejection(f, M, g,rg, 10000)

print(length(result[result <= 0.8])/10000)

Typical output: 0.9016

You can also make a density histogram of result and compare it to the theoretical beta distribution:

> hist(result,freq = FALSE)
> points(seq(0,1,0.01),dbeta(seq(0,1,0.01),4,3),type = "l")

The match is quite good:

enter image description here

John Coleman
  • 51,337
  • 7
  • 54
  • 119