
When I simulate clinical trial data to compare the performance of bootstrap CIs with Wald CIs, my simulation takes hours. Here is the structure of my code:

rep <- bootstrap <- 1000   # 1000 simulated trials, 1000 bootstrap resamples each
effectsize <- 0.6
estimate <- numeric(bootstrap)
coverage <- logical(rep)

for (i in 1:rep) {
    df <- SimulateOneTrial(...)  # function that yields one virtual clinical trial dataset
    for (j in 1:bootstrap) {
        index <- sample(1:nrow(df), nrow(df), replace = TRUE)
        estimate[j] <- analysis(df[index, ])  # function that performs the analysis and returns an estimate
    }
    bootstrap.CI <- quantile(estimate, c(0.025, 0.975))
    coverage[i] <- (bootstrap.CI[1] <= effectsize && bootstrap.CI[2] >= effectsize)
}
sum(coverage) / rep

I'm wondering if there is a way to speed it up. Thanks in advance!

  • For example, instead of calling your simulate function 1000 times for 50 patients, call it once for 50,000 patients (a sketch of this idea follows the comments). – rawr Dec 11 '21 at 04:35
  • You're not making any obvious performance-tanking errors like growing objects. `SimulateOneTrial()` and `analysis()` likely take up 99% of the compute time each iteration. If you're going to run `SimulateOneTrial()` 1,000 times and `analysis()` 1,000,000 times, focus on speeding up those functions, especially `analysis()`. The way to know how to do that is [to profile your code](https://stackoverflow.com/questions/3650862/how-to-efficiently-use-rprof-in-r). You can also look at running things in parallel (see the sketch below the comments). – Gregor Thomas Dec 11 '21 at 04:38
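A minimal sketch of the first comment's suggestion, under the assumption (not stated in the question) that `SimulateOneTrial()` accepts a sample-size argument and simulates patients independently, so one large call can be split into per-trial datasets:

```r
# Hypothetical sketch: simulate all trials in one call, then split by trial.
# Assumes SimulateOneTrial(n) takes a sample size and simulates rows independently.
n_per_trial <- 50                                      # e.g. 50 patients per trial
n_trials    <- 1000                                    # the question's `rep`
big_df   <- SimulateOneTrial(n_per_trial * n_trials)   # one call for 50,000 patients
trial_id <- rep(seq_len(n_trials), each = n_per_trial) # label each row with its trial
trials   <- split(big_df, trial_id)                    # list of 1,000 trial data frames

# Loop (or lapply) over `trials` instead of calling SimulateOneTrial() 1,000 times.
```

How much this helps depends on whether the runtime is dominated by `SimulateOneTrial()` or by `analysis()`, which is what profiling (next sketch) would tell you.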

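A hedged sketch of the second comment's two suggestions (profiling, then parallelisation), using only base R: `Rprof()`/`summaryRprof()` and the `parallel` package. `one_rep()` is a hypothetical wrapper around the body of the question's outer loop; `SimulateOneTrial()` and `analysis()` remain the question's own (unshown) functions, so their arguments are forwarded through `...`:

```r
library(parallel)

# Hypothetical wrapper around one replication of the question's outer loop.
# SimulateOneTrial() and analysis() are the question's own functions; any
# arguments they need can be passed through `...`.
one_rep <- function(effectsize = 0.6, bootstrap = 1000, ...) {
  df  <- SimulateOneTrial(...)
  est <- vapply(seq_len(bootstrap), function(j) {
    idx <- sample(nrow(df), replace = TRUE)     # resample rows with replacement
    analysis(df[idx, ])
  }, numeric(1))
  ci <- quantile(est, c(0.025, 0.975))
  ci[1] <= effectsize && ci[2] >= effectsize    # TRUE if the CI covers the true effect
}

# 1) Profile a single replication to see where the time actually goes.
Rprof("sim_profile.out")
invisible(one_rep())
Rprof(NULL)
summaryRprof("sim_profile.out")$by.self         # time spent per function

# 2) Run the 1,000 replications in parallel across local cores.
cl <- makeCluster(detectCores() - 1)
clusterExport(cl, c("SimulateOneTrial", "analysis", "one_rep"))
coverage <- parSapply(cl, 1:1000, function(i) one_rep())
stopCluster(cl)
mean(coverage)                                  # estimated coverage probability
```

Note that with a PSOCK cluster the workers are fresh R sessions, so anything `SimulateOneTrial()` or `analysis()` depends on (packages, other objects) also has to be loaded or exported on the workers.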