I have a fairly complicated model, written across several R scripts. The final function call is:

model <- LSV_funct(s,e,h,matrix1,matrix2,vector1,A1,A2,B1,B2)

The parameters s, e and h are constants; matrix1, matrix2 and vector1 are input values; and A1, A2, B1 and B2 need to be optimized, based on the best correlation with empirical data. I am currently optimizing these four parameters with four nested loops. The code is below:

s =  524;   e =  684;  h   =  40
results <- c()
A1s <- seq(1, 10, 0.5) 
A2s <- seq(10, 20, 0.5)  
B1s <- seq(0.01, 0.04, 0.001)  
B2s <- seq(0.5, 0.9, 0.1)   

for (i in seq_along(A1s)) {
  for (j in seq_along(A2s)) {
    for (y in seq_along(B1s)) {
      for (k in seq_along(B2s)) {

        A1 <- A1s[i]; A2 <- A2s[j]; B1 <- B1s[y]; B2 <- B2s[k]

        matrix1 <- as.matrix(read.csv("inputs1.csv", header = FALSE)) # read in matrix1
        matrix2 <- as.matrix(read.csv("inputs2.csv", header = FALSE)) # read in matrix2
        vector1 <- c(0.3, 0.08, 0.045, 48.25, 9.32, 54, 85, 6, 15, 1250)

        source(file = "mypath/LSV_fun.R", chdir = TRUE) # call the R script where the full model is written
        model <- LSV_funct(s, e, h, matrix1, matrix2, vector1, A1, A2, B1, B2) # run the function

        out_model <- model[[1]] # keep one model output for comparison with an empirical dataset
        r <- cor(out_model, empirical_dataset) # correlation between modelled and observed data
        comb <- cbind(r, A1, A2, B1, B2) # correlation value and parameter combination
        results <- rbind(results, comb)
      }
    }
  }
}
combin <- as.data.frame(results)  #save everything in a dataframe
names(combin) <- c("corr", "A1", "A2", "B1", "B2")

best <- subset(combin, combin$corr == max(combin$corr)) # finally, keep the parameter combination that gives the best correlation
print(best)

The problem is that this approach is not time-efficient at all. It takes up to several hours to run the optimization and save the best set of parameters.

Is there a smarter function for doing the same operation more efficiently? I had a look at the optim() function, but I had difficulty applying it to my purpose and my model (I do not have much experience with optimization algorithms...)

Thanks in advance for any suggestion!

refroll
  • I'm having problems understanding why you read your data inside the loops. Wouldn't it be sufficient (and much more efficient) to do that once? Also, it seems strange to maximize the correlation. Usually one would minimize the sum of squared residuals. – Roland Jan 05 '15 at 16:26
  • Please study the concept of a [minimal reproducible example](http://stackoverflow.com/a/5963610/1412059). – Roland Jan 05 '15 at 16:28
  • Thanks for the tip. True, I moved the data reading outside the loops. – refroll Jan 06 '15 at 08:23
  • I did not include the complete function for the sake of simplicity; it's really long and complicated. And my question was about the optimization function, not about how to optimize the function itself. – refroll Jan 06 '15 at 08:25
  • We don't need your function. We need *a* (representative) function, which allows us to test solutions. Otherwise you can get only some general remarks at best. – Roland Jan 06 '15 at 08:47

1 Answer

From your description, it seems that LSV_funct takes most of the time; the nested loops themselves are almost certainly not the problem here. (Double-check by removing the body of the innermost loop.)
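A quick way to check where the time goes is to time a single model call and multiply by the number of grid points. The sketch below is only an illustration: `slow_model` is a hypothetical stand-in for LSV_funct (the real function was not posted), and the sleep merely mimics an expensive computation.

```r
# Hypothetical stand-in for LSV_funct: sleeps briefly to mimic an expensive model.
slow_model <- function(A1, A2, B1, B2) {
  Sys.sleep(0.05)
  list(A1 * exp(-B1 * 1:20) + A2 * B2)
}

# Time one model call on its own.
t_call <- system.time(slow_model(1, 10, 0.01, 0.5))[["elapsed"]]

# Number of parameter combinations in the grid from the question.
n_comb <- length(seq(1, 10, 0.5)) * length(seq(10, 20, 0.5)) *
  length(seq(0.01, 0.04, 0.001)) * length(seq(0.5, 0.9, 0.1))

# Rough total run time in hours, assuming the model call dominates.
est_hours <- t_call * n_comb / 3600
cat("combinations:", n_comb, " estimated hours:", round(est_hours, 2), "\n")
```

If the estimate is close to the observed run time, the model call dominates and restructuring the loops will not help much.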

There are several ways to tackle the problem:

  1. Improve run time of LSV_funct
  2. Distribute computation over several cores/processors/machines
  3. Evaluate only a (cleverly chosen) subset of the parameters.

Take a look at the BatchJobs package for the second, and at the BatchExperiments package (especially the function makeDesign) for the third option. (The latter package uses the former package as backend.)
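As an illustration of options 2 and 3 without those packages, here is a minimal sketch using only base R's parallel package (a swapped-in technique, not the BatchJobs API). `toy_model` and the synthetic `empirical_dataset` are hypothetical stand-ins, since the real LSV_funct and data were not posted, and the grid is coarser than the one in the question.

```r
library(parallel)

# Hypothetical stand-in for LSV_funct; the first list element is the modelled series.
toy_model <- function(A1, A2, B1, B2) {
  x <- 1:50
  list(A1 * exp(-B1 * x) + A2 * sin(B2 * x))
}

# Synthetic "empirical" data (assumption: known parameters plus noise).
set.seed(1)
empirical_dataset <- toy_model(5, 15, 0.02, 0.7)[[1]] + rnorm(50, sd = 0.5)

# All parameter combinations in one data frame instead of four nested loops.
grid <- expand.grid(A1 = seq(1, 10, 1), A2 = seq(10, 20, 1),
                    B1 = seq(0.01, 0.04, 0.01), B2 = seq(0.5, 0.9, 0.1))

# Option 3 in its crudest form: evaluate only a random subset of the grid.
grid <- grid[sample(nrow(grid), 200), ]

# Correlation between modelled and observed data for row i of the grid.
score <- function(i) {
  p <- grid[i, ]
  cor(toy_model(p$A1, p$A2, p$B1, p$B2)[[1]], empirical_dataset)
}

# Option 2: spread the evaluations over cores. mclapply forks, which is
# not available on Windows, so fall back to a single core there.
n_cores <- if (.Platform$OS.type == "windows") 1L else 2L
grid$corr <- unlist(mclapply(seq_len(nrow(grid)), score, mc.cores = n_cores))

best <- grid[which.max(grid$corr), ]
print(best)
```

On Windows, parLapply with a cluster created by makeCluster is the usual alternative to mclapply. Input files should be read once, before the parallel call, not inside `score`.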

I can't find enough detail in your question to help with the first option.
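Another way to evaluate only a chosen subset of parameters (option 3) is to let an optimizer pick the points, treating the model as a black box. The following is a minimal sketch with optim() and box constraints taken from the ranges in the question; `toy_model` and `empirical_dataset` are hypothetical stand-ins, and whether a gradient-based method suits the real LSV_funct is an assumption that would need checking.

```r
# Hypothetical stand-in for LSV_funct; the first list element is the modelled series.
toy_model <- function(A1, A2, B1, B2) {
  x <- 1:50
  list(A1 * exp(-B1 * x) + A2 * x^B2)
}

# Synthetic "empirical" data (assumption: known parameters plus noise).
set.seed(1)
empirical_dataset <- toy_model(5, 15, 0.02, 0.7)[[1]] + rnorm(50, sd = 0.5)

# optim() minimizes, so return the negative correlation.
neg_cor <- function(p) {
  -cor(toy_model(p[1], p[2], p[3], p[4])[[1]], empirical_dataset)
}

# Search bounds matching the ranges of the loops in the question.
lower <- c(A1 = 1,  A2 = 10, B1 = 0.01, B2 = 0.5)
upper <- c(A1 = 10, A2 = 20, B1 = 0.04, B2 = 0.9)
start <- (lower + upper) / 2

# L-BFGS-B supports box constraints; parscale tells optim the typical
# magnitude of each parameter so that step sizes are comparable.
fit <- optim(start, neg_cor, method = "L-BFGS-B",
             lower = lower, upper = upper,
             control = list(parscale = c(1, 1, 0.01, 0.1)))

print(fit$par)     # parameter values found
print(-fit$value)  # correlation at those values
print(fit$counts)  # number of function and gradient evaluations
```

This typically needs far fewer model evaluations than the full grid, but it may stop at a local optimum, so trying several starting points is advisable. Because correlation ignores scale and offset, replacing `neg_cor` with a sum of squared residuals, as suggested in the comments on the question, may identify the parameters better.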

krlmlr
  • Thanks anyway for the suggestions. Sorry for not posting the full function, I did not write it myself and my intention is not to modify it. – refroll Jan 06 '15 at 08:27
  • @refroll: The other options work without modifying the full function. – krlmlr Jan 06 '15 at 11:07