-1

I am doing a simulation study and I wrote the following R code. Is there anyway to write this code without using two for loop, or make it more efficient (run faster)?

S = 10000
n = 100
v = c(5,10,50,100)
beta0.mle = matrix(NA,S,length(v))  #creating 4 S by n NA matrix 
beta1.mle = matrix(NA,S,length(v))
beta0.lse = matrix(NA,S,length(v))
beta1.lse = matrix(NA,S,length(v))
for (j in 1:length(v)){
  for (i in 1:S){
    set.seed(i)
    beta0 = 50
    beta1 = 10
    x = rnorm(n)
    e.t = rt(n,v[j])
    y.t = e.t + beta0 + beta1*x
    func1 = function(betas){
      beta0 = betas[1]
      beta1 = betas[2]
      sum = sum(log(1+1/v[j]*(y.t-beta0-beta1*x)^2))
      return((v[j]+1)/2*sum)
    }
    beta0.mle[i,j] = nlm(func1,c(1,1),iterlim = 1000)$estimate[1]
    beta1.mle[i,j] = nlm(func1,c(1,1),iterlim = 1000)$estimate[2]
    beta0.lse[i,j] = lm(y.t~x)$coef[1]
    beta1.lse[i,j] = lm(y.t~x)$coef[2]
  }
}

The function func1 inside the second for loop is used for nlm function (to find mle when errors are t distributed). I wanted to use parallel package in R but I didn't find any useful functions.

Tylar
  • 13
  • 1
  • Can you give a description of what you're trying to achieve? i.e., what the loops **should** be doing, and what you expect the output to be? – SymbolixAU Nov 04 '16 at 00:52
  • Not at all clear to me why you define func1 inside loop, rather than pass j, x, to it as parameters – dww Nov 04 '16 at 01:53
  • `lineprof` might be helpful in finding the steps that are slowest. – mt1022 Nov 04 '16 at 03:06

1 Answers1

2

The key to getting anything to run faster in R is replacing for loops with vectorized functions (such as the apply family). Additionally, as for any programming language, you should look for places where you are calling expensive functions (such as nlm) more than once with the same parameters and see where you can store the results rather than recomputing each time.

Here I am starting as you did by defining the parameters. Also since beta0 and beta1 always 50 and 10 I am going to define those here as well.

S <- 10000
n <- 100
v <- c(5,10,50,100)
beta0 <- 50
beta1 <- 10

Next we will define func1 outside the loop to avoid redefining it each time. func1 now has two extra parameters, v and y.t so that it can be called with the new values.

func1 <- function(betas, v, y.t, x){
  beta0 <- betas[1]
  beta1 <- betas[2]
  sum <- sum(log(1+1/v*(y.t-beta0-beta1*x)^2))
  return((v+1)/2*sum)
}

Now we actually do the real work. Rather than having nested loops, we use nested apply statements. The outer lapply will make a list for each value of v and the inner vapply will make a matrix for the four values you want to get (beta0.mle, beta1.mle, beta0.sle, beta1.lse) for each value of S.

values <- lapply(v, function(j) vapply(1:S, function(s) {
  # This should look familiar, it is taken from your code
  set.seed(s)
  x <- rnorm(n)
  e.t <- rt(n,j)
  y.t <- e.t + beta0 + beta1*x
  # Rather than running `nlm` and `lm` twice, we run it once and store the results
  nlmmod <- nlm(func1,c(1,1), j, y.t, x, iterlim = 1000)
  lmmod <- lm(y.t~x)
  # now we return the four values of interest
  c(beta0.mle = nlmmod$estimate[1],
    beta1.mle = nlmmod$estimate[2],
    beta0.lse = lmmod$coef[1],
    beta1.lse = lmmod$coef[2])
}, numeric(4)) # this tells `vapply` what to expect out of the function
)

Finally we can reorganize everything into the four matrices.

beta0.mle <- vapply(values, function(x) x["beta0.mle", ], numeric(S))
beta1.mle <- vapply(values, function(x) x["beta1.mle", ], numeric(S))
beta0.lse <- vapply(values, function(x) x["beta0.lse.(Intercept)", ], numeric(S))
beta1.lse <- vapply(values, function(x) x["beta1.lse.x", ], numeric(S))

As a final note, it may be possible to reorganize this to run even faster depending on why you are using the S index to set the seed. If it is important to know what seed was used to generate your x with rnorm then this may be there best I can do. However if you are only doing it to ensure that all of your values of v are being tested on the same values of x then there may be more reorganizing we can do that may produce more speed up using replicate.

Barker
  • 2,074
  • 2
  • 17
  • 31
  • `apply` is loop-hiding, not vectorization. While a small speed-up can be had by moving from `for` to `apply`, it should not be confused with the speed-up that moves from `for` or `apply` to true vectorization---which is usually much greater. See, e.g., [Is R's apply family syntactic sugar?](http://stackoverflow.com/a/2276001/903061) or [Circle 4 of the R Inferno](http://www.burns-stat.com/pages/Tutor/R_inferno.pdf). – Gregor Thomas Nov 04 '16 at 17:17
  • You have a point about `for` loop hiding and vectorization, but I feel this is where the definition vectorization gets fuzzy. If you call a vectorized function one that accepts and is optimized for a vector of inputs, it qualifies. If you call it one that evaluates that whole vector simultaneously, it doesn't. The SO post you quote is full of answers showing 3-10x speed ups for switching from `for`s to `apply`s. The big difference with `apply`s is that the looping and variable creation happens in `C` rather than `R` which can be much faster. – Barker Nov 04 '16 at 17:50
  • It's true, but the dramatic speed-ups sometimes seen in switching to apply tend to be in cases where what's happening inside the loop is trivial (see John's answer). In most practical cases, as in this case, the loop body does something non-trivial. You've made some very good efficiency improvements in this answer in the way you've restructured the code. I just think you're giving the impression that the majority of the improvements are due to using `vapply`, whereas that's really just the icing on the cake. – Gregor Thomas Nov 04 '16 at 18:16
  • You are correct that a lot of the speed up has nothing to do with the `apply`s (ie. creating variables once, calling `nlm` and `lm` once per loop), and since the function called is complicated, most of the time is happening in the loop rather than outside. However, since there are a lot of iterations, time spent looping and storing values does add up, though I haven't bench-marked the two options so I can't say for sure how much. If it was hard to engineer, the speed up might not be worth the effort, but it is easy. – Barker Nov 04 '16 at 18:26
  • @Gregor Thanks for the catch on the missing `x` parameter. I just fixed it. – Barker Nov 04 '16 at 18:31