
I'm trying to run a simulation many times, with parameter values drawn from different distributions, and I would like to save the results of each run as a column in several matrices. This should be a pretty common task, but I'm having a hard time parallelising my original script, which uses nested loops. I wonder if I could find some help here.

Here is my attempt at parallelising it:

library(foreach)
library(doParallel)
library(raster)

cl <- makeCluster(5)
registerDoParallel(cl)

ccomb <- function(...) {
  args <- list(...)
  lapply(seq_along(args[[1]]), function(i)
    do.call('cbind', lapply(args, function(a) a[[i]])))
}

sim.results <- foreach(i = 1:n.iter, .combine = 'ccomb', .multicombine = TRUE,
                       .packages = c("raster")) %dopar% {

  # Years must run in order: year k+1 depends on the result for year k
  for (k in 1:n.year) {
    temp.s1 <- matrix.s1[, k] * matrix.pro[, k]

    temp.raster <- raster(nrow = 10, ncol = 10, xmn = 0, xmx = 10, ymn = 0, ymx = 10,
                          crs = NA, resolution = c(1, 1))
    temp.raster[] <- temp.s1

    # Moving-window (focal) sum, weighted by matrix.mv
    temp.raster.mv <- focal(temp.raster, w = matrix.mv, fun = sum, na.rm = TRUE)

    post.mv <- as.vector(temp.raster.mv)

    matrix.s1[, k + 1] <- post.mv + matrix.s2[, k]
    matrix.s2[, k + 1] <- matrix.s2[, k] * matrix.pro[, k]
  }

  matrix.s1_year2 <- matrix.s1[, 2]
  matrix.s1_year3 <- matrix.s1[, 3]
  matrix.s1_year4 <- matrix.s1[, 4]

  matrix.s2_year2 <- matrix.s2[, 2]
  matrix.s2_year3 <- matrix.s2[, 3]
  matrix.s2_year4 <- matrix.s2[, 4]

  # One vector per output matrix; ccomb binds them column-wise across iterations
  list(matrix.s1_year2, matrix.s1_year3, matrix.s1_year4,
       matrix.s2_year2, matrix.s2_year3, matrix.s2_year4)
}
stopCluster(cl)

The inner for(k in 1:n.year) loop is sequential (the computation for each year depends on the result from the previous year) and contains a moving-window function, so I assumed it has to stay intact. The outer foreach loop sets how many times the inner loop is run, and I would like to save all the runs in the six matrices at the end of the outer loop.
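To make the intended output shape concrete, here is a minimal sketch of what the `ccomb` combine function produces. It uses `%do%` (sequential) so that no cluster is needed, and the toy sizes `n.iter = 4` and `n.cell = 3` and the toy per-run vectors are assumptions for illustration only, not the simulation itself.

```r
library(foreach)

# Same combine function as in the question: column-binds the i-th element
# of every per-iteration result list.
ccomb <- function(...) {
  args <- list(...)
  lapply(seq_along(args[[1]]), function(i)
    do.call(cbind, lapply(args, function(a) a[[i]])))
}

n.iter <- 4  # hypothetical number of runs
n.cell <- 3  # hypothetical number of cells per run

# %do% runs sequentially, so the shape can be inspected without a cluster.
res <- foreach(i = 1:n.iter, .combine = ccomb, .multicombine = TRUE) %do% {
  list(rep(i, n.cell), rep(10 * i, n.cell))  # two toy vectors per run
}

length(res)    # one matrix per element of the per-run list
dim(res[[1]])  # n.cell rows, n.iter columns
```

Each element of `res` is a matrix with one column per run, which is the layout the six `matrix.s*_year*` matrices are meant to have.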

However, when I ran the above code, I found it used only one core and was not any faster. If I understand this post correctly, putting for loops inside foreach loops should be fine.

I wonder if anyone could help me find where the problem might be? Thanks very much!

UPDATE

Here is sample data:

n.year<-3  
n.iter<-5  

matrix.s1<-matrix(NA,100,n.year+1,dimnames= list(NULL,c("year1","year2","year3","year4")))
matrix.s2<-matrix(NA,100,n.year+1,dimnames= list(NULL,c("year1","year2","year3","year4")))
matrix.pro<-matrix(rbeta(300,1,1),100,n.year,dimnames= list(NULL,c("year1","year2","year3")))
matrix.mv<-matrix(rbeta(25,1,1),5,5)
matrix.s1_year2<-matrix.s1_year3<-matrix.s1_year4<-matrix(NA,100,n.iter)
matrix.s2_year2<-matrix.s2_year3<-matrix.s2_year4<-matrix(NA,100,n.iter)

matrix.s1[,1]<-runif(100,0,10)
matrix.s2[,1]<-runif(100,10,20)  
CYH
  • You're asking a lot for somebody to take on all of this to help you. Is it possible to reduce the problem to something that doesn't involve a dozen or two variables and dozens of lines of code? – r2evans Apr 09 '18 at 21:08
  • Ah sorry about that, I just thought it might make the question clearer if I provide an example. I guess I should just take the examples out. Hope it's clearer now. – CYH Apr 09 '18 at 21:41
  • This is a good reduction, thanks. And yes, "example" is good, but "concise and sufficient" is typically better, especially when a good portion of answerers are doing this on breaks between work projects (i.e., "distracted"). Okay, perhaps I'm just describing myself. – r2evans Apr 09 '18 at 22:16
  • You define `i` in the outer `foreach`, but none of your data is actually subdivided, so each parallel process will be doing the same thing. Is there some stochastic component of this where you are intending to repeat the block `n.iter` times? How is it you know that it used only one core? – r2evans Apr 09 '18 at 23:06
  • Please provide some sample data to make your code actually reproducible. – F. Privé Apr 10 '18 at 05:40
  • Hi F. Privé, I have added the sample data back in. I hope it helps show why I tried to keep the whole `for` loop within the `foreach` loop. I saw some posts discussing nested `foreach` loops, but they seem to run into problems when the inner loop is sequential and contains a moving-window function. – CYH Apr 10 '18 at 14:12
  • Hi @r2evans. I didn't notice that I had left out `i` in the outer `foreach` loop; I have added the missing `i` back to the post. I use `i` in the same way as when I write nested `for` loops. I wonder whether the misplaced `i` was the reason why the code didn't run in parallel successfully. – CYH Apr 10 '18 at 14:20
  • Regarding how I know it used only one core: I checked Task Manager and saw that several front-ends were created but only one was in use (its CPU usage was not 0%). I assumed this meant that it was still using just one core. – CYH Apr 10 '18 at 14:27
  • I'd like to self-answer the question, as the reason why the above code failed to run in parallel was surprising to me. It was simply because, on a Windows machine, if another user is already running something in parallel, other scripts cannot be run in parallel, even if there are still cores and memory available, and no error or warning message is shown. This probably happens often on shared high-performance computers. So I'd just like to leave a note here in case someone runs into the same issue. – CYH Apr 11 '18 at 18:06
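As a complement to watching Task Manager, one possible way to check from within R whether tasks are really being spread over several workers is to have each task report its process ID. The sketch below is an assumption-laden illustration, not a diagnosis of the issue above: the worker count of 2, the 8 toy tasks, and the short `Sys.sleep()` are all arbitrary choices.

```r
library(foreach)
library(doParallel)

cl <- makeCluster(2)  # 2 workers is an arbitrary choice for this sketch
registerDoParallel(cl)

n.workers <- getDoParWorkers()  # number of workers foreach will dispatch to

# Each task returns the process ID of the worker that ran it.
pids <- foreach(i = 1:8, .combine = c) %dopar% {
  Sys.sleep(0.2)  # keep each worker busy briefly so tasks spread out
  Sys.getpid()
}
stopCluster(cl)

n.workers
unique(pids)  # several distinct PIDs suggest tasks ran on several workers
```

If `unique(pids)` contains only a single value even though several workers are registered, that would be consistent with the tasks not actually being distributed.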
