Consider the first example, which calculates the running means inside the loop.
st <- Sys.time() #Starting Time
set.seed(123456789)
vara <- c()
sda <- c()
mvara <- c() # running mean of the variances
msda <- c() # running mean of the standard deviations
K <- 100000
for(i in 1:K) {
  a <- rnorm(30)
  vara[i] <- var(a)
  sda[i] <- sd(a)
  mvara[i] <- mean(vara) # mean of all variances so far
  msda[i] <- mean(sda) # mean of all standard deviations so far
}
et <- Sys.time()
et - st # elapsed time (a little over one minute)
Now consider the same code, except that the running means are calculated outside the loop.
st <- Sys.time() #Starting Time
set.seed(123456789)
vara <- c()
sda <- c()
K <- 100000
for(i in 1:K) {
  a <- rnorm(30)
  vara[i] <- var(a)
  sda[i] <- sd(a)
}
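mvara <- cumsum(vara) / (1:K) # running mean of the variances
msda <- cumsum(sda) / (1:K) # running mean of the standard deviations
et <- Sys.time()
et - st # elapsed time (less than 5 seconds)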
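As a sanity check that the two versions really compute the same quantity, here is a minimal sketch on a much smaller K. The names mvara_loop and mvara_vec are made up for this check, and the vectors are preallocated with numeric(K), which is not what the code above does. Note that the in-loop mean rereads all i values stored so far, so over the whole loop it touches about K(K+1)/2 elements, whereas cumsum makes a single pass.

```r
# Compare the in-loop running mean with the cumsum version on a small K.
set.seed(123456789)
K <- 1000                    # far smaller than 100000, just for a quick check
vara <- numeric(K)           # preallocated (assumption: differs from c() above)
mvara_loop <- numeric(K)     # hypothetical name: running mean built in the loop

for (i in seq_len(K)) {
  a <- rnorm(30)
  vara[i] <- var(a)
  mvara_loop[i] <- mean(vara[1:i])  # rereads the first i values every iteration
}

# hypothetical name: running mean built in one pass after the loop
mvara_vec <- cumsum(vara) / seq_len(K)

# all.equal() compares within floating-point tolerance
same <- isTRUE(all.equal(mvara_loop, mvara_vec))
print(same)
```

If same is TRUE, the only difference between the two versions is how much work is done to get the result, not the result itself.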
I would like to know why there is such a large difference in performance between the two versions. What should one watch out for when using loops?