I converted my serial code to parallel, but the parallel version takes much longer than the sequential one and I don't know why. The sequential code takes only a few seconds, while the parallel code takes about 8 to 9 minutes. How can I improve my code?

I was able to reproduce the problem with a simplified example; the code is below. The data structure is the same as in my original problem and (I believe) can't be changed.

Parallel Code:

rm(list=ls())
library(foreach); library(doParallel)

CORES<-16
c1<-makeCluster(CORES)
registerDoParallel(c1)

CASES<-64
preV<-list()
for (i in 1:CASES) preV[[i]]<-c(110, 1340, 4560, 230, 10)

#mimicking original problem
#===========================================
out<-list()
for (i in 1:CASES) out[[i]]<-list()
out1<-matrix(0,nrow=131041,ncol=6)
for (i in 1:CASES){ 
  out1[,]<-runif(6*131041,1,100)
  out[[i]]<-rbind(out[[i]],out1)
}
#===========================================

Y1<-Y2<-Y3<-Y4<-Y5<-c(0)
TOL <- 1E-3

time1<-Sys.time()
status<-foreach (j = 1:CASES, .combine='rbind', .inorder=FALSE) %dopar% {
#status<-list(); for (i in 1:CASES) status[[i]]<-list()
#for (j in 1:CASES)  {
  Y1[j] <- as.numeric(out[[j]][length(out[[j]][,1]),2])
  Y2[j] <- as.numeric(out[[j]][length(out[[j]][,1]),3])
  Y3[j] <- as.numeric(out[[j]][length(out[[j]][,1]),4])
  Y4[j] <- as.numeric(out[[j]][length(out[[j]][,1]),5])
  Y5[j] <- as.numeric(out[[j]][length(out[[j]][,1]),6])

  v1<-abs(Y1[j]-preV[[j]][1])
  v2<-abs(Y2[j]-preV[[j]][2])
  v3<-abs(Y3[j]-preV[[j]][3])
  v4<-abs(Y4[j]-preV[[j]][4])
  v5<-abs(Y5[j]-preV[[j]][5])


  if (v1<TOL & v2<TOL & v3<TOL & v4<TOL & v5<TOL)
    st<-1
  else
    st<-0

  #status[[j]]<-c(Y1[j],Y2[j],Y3[j],Y4[j],Y5[j],st)
  return (c(j,Sys.getpid(),Y1[j],Y2[j],Y3[j],Y4[j],Y5[j],st))
}# End of j-Loop

time2<-Sys.time()

stopCluster(c1)
print(difftime(time2,time1))

To convert parallel code to serial:

Uncomment the following three lines:

#status<-list(); for (i in 1:CASES) status[[i]]<-list()
#for (j in 1:CASES)  {
..............
..............
#status[[j]]<-c(Y1[j],Y2[j],Y3[j],Y4[j],Y5[j],st)

and comment out the following 2 lines:

status<-foreach (j = 1:CASES, .combine='rbind', .inorder=FALSE) %dopar% {
..................
..................
return (c(j,Sys.getpid(),Y1[j],Y2[j],Y3[j],Y4[j],Y5[j],st))
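
For reference, applying these changes gives a serial loop along the following lines. This is only a sketch assembled from the lines above: `NROWS` is a hypothetical name introduced here with a reduced row count (the original uses 131041) so that it runs quickly, and each case's matrix is filled directly rather than through `rbind`.

```r
# Serial version of the loop, built from the commented-out lines of the parallel code.
# Assumption: NROWS is a made-up name; the original hard-codes 131041 rows.
CASES <- 64
NROWS <- 1000
TOL <- 1E-3

preV <- list()
for (i in 1:CASES) preV[[i]] <- c(110, 1340, 4560, 230, 10)

# Mimic the original data structure: a list of 6-column numeric matrices.
out <- list()
for (i in 1:CASES) out[[i]] <- matrix(runif(6 * NROWS, 1, 100), nrow = NROWS, ncol = 6)

Y1 <- Y2 <- Y3 <- Y4 <- Y5 <- c(0)

time1 <- Sys.time()
status <- list(); for (i in 1:CASES) status[[i]] <- list()
for (j in 1:CASES) {
  # Take columns 2..6 of the last row of each case.
  Y1[j] <- as.numeric(out[[j]][length(out[[j]][, 1]), 2])
  Y2[j] <- as.numeric(out[[j]][length(out[[j]][, 1]), 3])
  Y3[j] <- as.numeric(out[[j]][length(out[[j]][, 1]), 4])
  Y4[j] <- as.numeric(out[[j]][length(out[[j]][, 1]), 5])
  Y5[j] <- as.numeric(out[[j]][length(out[[j]][, 1]), 6])

  v1 <- abs(Y1[j] - preV[[j]][1])
  v2 <- abs(Y2[j] - preV[[j]][2])
  v3 <- abs(Y3[j] - preV[[j]][3])
  v4 <- abs(Y4[j] - preV[[j]][4])
  v5 <- abs(Y5[j] - preV[[j]][5])

  if (v1 < TOL & v2 < TOL & v3 < TOL & v4 < TOL & v5 < TOL)
    st <- 1
  else
    st <- 0

  status[[j]] <- c(Y1[j], Y2[j], Y3[j], Y4[j], Y5[j], st)
} # End of j-Loop
time2 <- Sys.time()

print(difftime(time2, time1))
```

Here `status` is a list of length `CASES`, where each element holds the five values from the last row followed by the convergence flag `st`.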
bell
  • "How can I improve my code?" suggests that this question might be a better fit on [codereview.se] – John Coleman Aug 14 '17 at 15:08
  • Possible duplicate of [Why is the parallel package slower than just using apply?](https://stackoverflow.com/questions/14614306/why-is-the-parallel-package-slower-than-just-using-apply) – Ralf Stubner Aug 14 '17 at 15:22
  • What do you want to achieve? Parallelizing the "mimicking original problem"? – F. Privé Aug 14 '17 at 20:53
  • I am wondering why there is such a big time difference between the two. I understand there will be some overhead, but I was not expecting that much of a difference. The parallel code takes 8-9 minutes while the serial code takes 2-3 seconds. – bell Aug 14 '17 at 23:01
