(a) Generate 1000 samples where each consists of 50 independent exponential random variables with mean 1. Estimate the mean of each sample. Draw a histogram of the means. (b) Perform a KS test on each sample against the null hypothesis that they are from an exponential random variable with a mean that matches the mean of the data set. Draw a histogram of the 1000 values of D.

I did part (a) with this code:

set.seed(0)
simdata = rexp(50000, 1)
matrixdata = matrix(simdata,nrow=50,ncol=1000)
means.exp = apply(matrixdata,2,mean)
means.exp
hist(means.exp)

but I'm stuck on part (b).

Peter Tan

1 Answer

You can use lapply on the column indices:

# KS test on every column
# H0: pexp(rate = 1/mean(column))
lst.ks <- lapply(1:ncol(matrixdata), function(i)
    ks.test(matrixdata[, i], "pexp", 1.0/means.exp[i]))

Or directly without having to rely on means.exp:

lst.ks <- lapply(1:ncol(matrixdata), function(i)
    ks.test(matrixdata[, i], "pexp", 1.0/mean(matrixdata[, i])))

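Here 1.0/means.exp[i] is the rate of the exponential distribution under the null hypothesis: an exponential with rate r has mean 1/r, so matching the sample mean means setting the rate to 1/mean.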
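To make it clearer what D measures, here is a minimal sketch that computes the statistic by hand for a single sample and compares it with ks.test. The names x, rate.hat, D.manual and D.kstest are illustrative and not part of the original code; the sketch assumes the two-sided statistic, which is the ks.test default.

```r
# Sketch: compute the KS statistic D by hand for one sample of size 50
set.seed(0)
x <- rexp(50, 1)
rate.hat <- 1 / mean(x)                 # rate estimated from the data

n <- length(x)
Fx <- pexp(sort(x), rate = rate.hat)    # hypothesised CDF at the ordered data

# D is the largest gap between the empirical and hypothesised CDFs,
# checked just after and just before each jump of the empirical CDF
D.manual <- max(pmax((1:n) / n - Fx, Fx - (0:(n - 1)) / n))

# Same statistic as reported by ks.test (names dropped for comparison)
D.kstest <- unname(ks.test(x, "pexp", rate.hat)$statistic)
all.equal(D.manual, D.kstest)
```

The lapply calls above simply repeat this calculation for each of the 1000 columns.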

PS. Using means.exp = colMeans(matrixdata) is faster than apply(matrixdata, 2, mean); there is a relevant SO post on this.
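As a quick sanity check that the two approaches give the same means, something like the following should work. It rebuilds matrixdata from the question's code; means.apply and means.col are names introduced here for illustration only.

```r
# Sketch: colMeans() and apply(..., 2, mean) give the same column means
set.seed(0)
matrixdata <- matrix(rexp(50000, 1), nrow = 50, ncol = 1000)

means.apply <- apply(matrixdata, 2, mean)   # one call to mean() per column
means.col <- colMeans(matrixdata)           # single vectorised call

# Compare up to floating-point tolerance
all.equal(means.apply, means.col)
```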


To extract the test statistic and store it in a vector simply sapply over the KS test results:

# Extract test statistic as vector
Dstat <- sapply(lst.ks, function(x) x$statistic)

# (gg)plot Dstat
library(ggplot2)
ggplot(data.frame(D = Dstat), aes(D)) + geom_histogram(bins = 30)
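If you would rather stay in base R, as in part (a), a self-contained sketch of the whole of part (b) might look like this. It uses vapply instead of lapply plus sapply so that the result is guaranteed to be a numeric vector; the breaks value and the plot title are arbitrary choices, not requirements of the exercise.

```r
# Sketch: part (b) end to end in base R
set.seed(0)
matrixdata <- matrix(rexp(50000, 1), nrow = 50, ncol = 1000)

# One KS statistic per column, testing against an exponential
# whose rate is 1 / (mean of that column)
Dstat <- vapply(seq_len(ncol(matrixdata)), function(i) {
    x <- matrixdata[, i]
    unname(ks.test(x, "pexp", 1 / mean(x))$statistic)
}, numeric(1))

# Histogram of the 1000 values of D
hist(Dstat, breaks = 30, xlab = "D", main = "KS statistic D for 1000 samples")
```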


Maurits Evers