I'd like to ask the experts what may be wrong with the code below:

resultOptimization <- foreach(i = seq_along(varietyNames), .combine = rbind, .init = resultOptimization) %dopar% {
  variety <- varietyNames[i]

  subsetVarietyNames <<- subset(observedValues, Variety == variety)
  rowNumberOfObservedValues <<- nrow(subsetVarietyNames)

  inputTrait <<- subsetVarietyNames[, c("Site", "Latitude", "Longitude", "Altitude", "Flooding",
                                        "Transplanting", "DDTransplantingShock", "Z", "LAIMax",
                                        "POP", "PLAini", "LRGRMax", "SowingDate")]

  resultOptimTrait <- genoud(fn = minimizationFunction, nvars = numberOfParametersToBeEstimated, max = FALSE,
                             Domains = optimizedParametersBounds, pop.size = 2, max.generations = 2,
                             wait.generations = 2, hard.generation.limit = TRUE, MemoryMatrix = TRUE,
                             starting.values = NULL, default.domains = 10, solution.tolerance = 0.001,
                             gr = NULL, boundary.enforcement = 0, lexical = FALSE, gradient.check = TRUE,
                             BFGS = FALSE, data.type.int = FALSE, hessian = TRUE, unif.seed = 812821,
                             int.seed = 53058, extra_arg = fMeteo)

  estimatedParameters <- resultOptimTrait$par
  sumOfErrors <- resultOptimTrait$value

  # In case of abrupt system failure, save preliminary results
  capture.output(c(variety, estimatedParameters, sumOfErrors),
                 file = "PRE_optim_values.txt", append = (i != 1))

  resultLine <- c(variety, estimatedParameters, sumOfErrors)
}

What I'm doing:

I'm trying to optimize parameters for several plant varieties. The first loop should get the optimized parameters for varieties 1 to N, and a second loop would print simulated values using those parameters. The code above is my attempt at parallelizing the first loop for faster processing, so that it handles as many varieties at a time as there are cores on the server.

Expected output:

Optimized parameters stored in the list "resultOptimization"

What I get:

resultOptimization[1] = V1
resultOptimization[2] = V1
resultOptimization[3] = V1

and so on, when it should be

resultOptimization[1] = V1
resultOptimization[2] = V2
resultOptimization[3] = V3

The problem is this:

When I run this loop sequentially, it produces the correct outcome: the optimized parameters of each variety (using rgenoud). When I parallelize it as in the code block above, the output seems to be only the first variety's result repeated over and over.

This is strange, as I'm 100% sure that the correct values are loaded before each genoud() call, and the sequential version of the program works correctly. Is anyone familiar with this problem?

To clarify: by "sequential" I mean running on 1 core. The problem appears when I use more than 1 core. Both runs use the code block above.
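
For reference, here is a minimal sketch of the same foreach/%dopar% pattern in which each task builds its data locally and passes it to the objective function as an argument, rather than assigning it globally with <<-. Everything in it is a hypothetical stand-in: a toy observedValues table, a one-parameter minimizationFunction, and base R's optimize() in place of genoud(). It assumes the foreach and doParallel packages are installed, and it is not claimed to be the cause of, or the fix for, the behaviour described above.

```r
library(foreach)
library(doParallel)

# Hypothetical stand-in data: three varieties with different true values.
observedValues <- data.frame(
  Variety = rep(c("V1", "V2", "V3"), each = 5),
  Yield   = c(rep(1, 5), rep(2, 5), rep(3, 5)),
  stringsAsFactors = FALSE
)
varietyNames <- unique(observedValues$Variety)

# Hypothetical objective: it receives its data as an argument
# instead of reading variables from the global environment.
minimizationFunction <- function(par, inputTrait) {
  sum((inputTrait$Yield - par)^2)
}

cl <- makeCluster(2)      # two worker processes (assumption)
registerDoParallel(cl)

resultOptimization <- foreach(i = seq_along(varietyNames), .combine = rbind) %dopar% {
  variety <- varietyNames[i]
  # Local to this task: nothing is shared between iterations.
  inputTrait <- observedValues[observedValues$Variety == variety, ]
  # optimize() stands in for genoud(); extra arguments are forwarded to the objective.
  fit <- optimize(minimizationFunction, interval = c(0, 10), inputTrait = inputTrait)
  data.frame(Variety = variety, Estimate = fit$minimum,
             SumOfErrors = fit$objective, stringsAsFactors = FALSE)
}

stopCluster(cl)
print(resultOptimization)
```

In this toy version each row of resultOptimization is produced from data created inside its own iteration, so a quick check is whether the Variety column lists every variety once.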

  • This is a very specific question which is hard to answer without a reproducible example. http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example – Roman Luštrik Feb 05 '13 at 07:46
  • I don't know exactly how to put in snippets that would make it work without copying in a lot of code. I thought this would be a general question, since one sees the first result repeated over and over when using dopar, but that does not seem to be the case. Should I put the code up here (<800 lines), or put it up elsewhere and leave a link here? Thanks! – Richard Pasco Feb 05 '13 at 08:07
  • The best thing you could do is narrow down your question. One other option I can see is to make an elaborate report of what you're doing and how you're doing things, what results you expect and what you actually get. If licence permits, you could publish to RPubs. – Roman Luštrik Feb 05 '13 at 12:29

0 Answers