Evaluating the resulted simulated data

Question

I am simulating data using the Rejection method where the density function of X is given by f(x)= C * e^(x) for all x in [0,1]. I defined g(x) = 1 and C to be the maximum of f(x) which is equal to 1/(e-1).

I used the following code to simulate data:

rejection <- function(f, C, g, rg, n) {
  naccepts <- 0
  result.sample <- rep(NA, n)

  while (naccepts < n) {
    y <- rg(1)
    u <- runif(1)

    if ( u <= f(y) / (C*g(y)) ) {
      naccepts <- naccepts + 1
      result.sample[naccepts] = y
    }
  }

  result.sample
}

f <- function(x) ifelse(x>=0 & x<=1, (exp(x)), 0)
g <- function(x) 1
rg <- runif
C <-  1/(exp(1) -1)

result <- rejection(f, C, g,rg, 1000)

Then, I use the histogram to compare the simulated data with the curve of original pdf as

hist(result,freq = FALSE)
curve(f, 0, 1, add=TRUE)

But the resulted plot is kind of weired! the plot is here so I am looking for any help to clarify what is wrong in my work.

It is correct, but you need to extend the y axis. `f` on (0,1) goes from 1 to e - the left end of it appears on the chart. Of course that might not be what you are intending to plot! — Andrew Gustar, Feb 21 '20 at 14:37
@AndrewGustar but why the sampled data don't match the curve? and as you see, when using the modification in the answer below, there i a huge difference between the curve and the histogram — adam, Feb 21 '20 at 14:58
Are you dividing by `C` when you should be multiplying, in the `if` statement in your loop? — Andrew Gustar, Feb 21 '20 at 15:06

score 0 · Answer 1 · answered Feb 21 '20 at 14:49

0

This shows the whole curve and histogram:

curve(f, 0, 1)
hist(result,freq = FALSE, add=TRUE)

But of course now the histogram is somewhat small in the plot...

answered Feb 21 '20 at 14:49

dario

6,415
2
12
26

thanks, or there is something wrong on my code and I don't know what it is – adam Feb 21 '20 at 14:56
This is the normal behaviour of base `plot`. The first call to `plot` defines the size (i.e. the zoom level) of the plot. By drawing the plot with the wider range of values first we make sure everything fits – dario Feb 21 '20 at 15:04
Yes I understand this part but I am wondering why there is no matching between the histogram and the curve – adam Feb 21 '20 at 15:06
ah, ok.. yea to me it does not look like that curve has an integral of 1. – dario Feb 21 '20 at 15:08

Severin Pappadeux · Answer 2 · 2020-02-23T19:55:53.843

Your maximum value is wrong. For exp(x) in the range [0...1] interval, normalized PDF would be

f(x) = exp(x)/(exp(1) - 1)

Thus maximum of f(x) is exp(1)/(exp(1) - 1)

UPDATE

Ok, here is code which follows wiki article on Rejection sampling

sampleRej <- function(f, g, rg, M, n) { # rejection sampling following wiki
    naccepts <- 0
    result   <- rep(NA, n)

    while (naccepts < n) {
        y <- rg(1)
        u <- runif(1)

        if ( u <= f(y) / (M*g(y)) ) {
            naccepts <- naccepts + 1
            result[naccepts] = y
        }
    }

    result
}

g <- function(x) { # Normalized uniform distribution PDF
    1.0
}

f <- function(x) { # normalized sampling PDF
    exp(x)/(exp(1.0) - 1.0)
}

rg <- function(n) { # function to sample from g, which is uniform
    runif(n)
}


M <- exp(1.0)/(exp(1.0) - 1.0) # f(x) maximum value
q <- sampleRej(f, g, rg, M, 100000) # sample 100K points

# and plot everything together
curve(f, 0.0, 1.0)
hist(q, freq=FALSE, add=TRUE)

and it produces graph like one below, looks good to me

Yes you are right, but what to consider g(x) when considering C as you showed? — adam, Feb 22 '20 at 13:46
@adam your code does not follow rules of the rejection sampling - your PDFs were not normalized properly. — Severin Pappadeux, Feb 23 '20 at 19:56

Evaluating the resulted simulated data

2 Answers2