I have numerical measurements on two groups of people and would like to compare the means of the two groups. I'm using a t-test for that, which gives me a confidence interval and a p-value. Now I'd like to do a bootstrap analysis on this data to get a feel for the variability of both the CI and the p-value.
I'm using R and the boot package. The data is stored in a data frame called data. To calculate the statistics I have this function:
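calculate <- function(formula, data, indices) {
  d <- data[indices, ]                 # rows selected for this bootstrap replicate
  m <- t.test(formula, data = d)       # two-sample t-test on the resample
  return(c(m$conf.int, m$p.value))     # CI lower bound, CI upper bound, p-value
}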
Then I run the bootstrap as follows:
library(boot)
results <- boot(data = data, statistic = calculate, R = 1000, formula = y ~ x)
Then I plot the p-values in "results" as follows:
hist(results$t[,3], breaks=32)
The histogram looks as shown below. I understand that the distribution of p-values is skewed because the p-value is constrained to be no smaller than zero. But I don't understand why the peak of the distribution is at zero as well, no matter how many breaks I display in the histogram.
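In case it helps, here is a self-contained sketch that should reproduce the setup without my real data. The simulated data frame (group labels "a" and "b", 30 people per group, a mean shift of 0.5) and the seed are assumptions for illustration only, not my actual measurements.

```r
library(boot)

set.seed(42)  # arbitrary seed, only so the run is repeatable

# Simulated stand-in for my data (assumed sizes, means, and SDs).
data <- data.frame(
  x = factor(rep(c("a", "b"), each = 30)),
  y = c(rnorm(30, mean = 0, sd = 1), rnorm(30, mean = 0.5, sd = 1))
)

# Statistic for boot(): t-test on the resampled rows,
# returning the CI bounds and the p-value.
calculate <- function(formula, data, indices) {
  d <- data[indices, ]
  m <- t.test(formula, data = d)
  c(m$conf.int, m$p.value)
}

results <- boot(data = data, statistic = calculate, R = 1000, formula = y ~ x)

# Column 3 of results$t holds the bootstrapped p-values.
hist(results$t[, 3], breaks = 32)
```

Columns 1 and 2 of results$t are the lower and upper CI bounds for each replicate, so they can be plotted the same way.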