The code below finishes in about 20 seconds with a while loop. With foreach and %dopar% it takes about 25 seconds without a cluster and about 28 seconds with one.
I'm looking for clarification: I've read here on Stack Overflow that small tasks can be slower with parallel processing, but even when I widen the +/- ranges passed to iproduct, the parallel version is still slower.
Is this because:
- I'm using iproduct and should use a different iterator instead, or
- The amount of data being iterated by iproduct needs to be much, much bigger for parallelizing to make sense, or
- The amount of computation in each iteration is so small that parallelizing won't ever make it faster?
Any help on getting my code to run faster would be great.
My final data won't be huge, since I only keep what makes it through the conditional if statement, but I want to iterate over wider ranges than I currently have for p1, p2 and r0, r1, r2.
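To put a number on the amount of data, here is a quick sketch (not part of the timed code) that counts how many combinations iproduct generates for the ranges used in the code below. The names `ranges` and `n_combos` are only for illustration.

```r
# Same centre values and +/- ranges as in the timed code
p1 <- 2; p2 <- 2; r0 <- 25; r1 <- 4; r2 <- 0

ranges <- list(
  ani1 = (p1 - 2):(p1 + 2),
  ani2 = (p2 - 2):(p2 + 2),
  fd0  = (r0 - 6):(r0 + 6),
  fd1  = (r1 - 4):(r1 + 6),
  fd2  = r2:(r2 + 6)
)

# The Cartesian product has one element per combination of values,
# so its size is the product of the individual range lengths
n_combos <- prod(lengths(ranges))
n_combos  # 5 * 5 * 13 * 11 * 7 = 25025
```

So each run evaluates the if condition 25025 times, and every one of those evaluations is only a handful of additions and comparisons.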
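Here is the while loop code:
library(iterators)  # nextElem()
library(itertools)  # iproduct(), ihasNext(), hasNext()

start_time <- Sys.time()
p1 <- 2
p2 <- 2
r0 <- 25
r1 <- 4
r2 <- 0
TB <- c()
# Iterator over every combination of the five ranges
iter_count <- ihasNext(iproduct(ani1 = (p1 - 2):(p1 + 2), ani2 = (p2 - 2):(p2 + 2),
                                fd0 = (r0 - 6):(r0 + 6), fd1 = (r1 - 4):(r1 + 6),
                                fd2 = r2:(r2 + 6)))
while (hasNext(iter_count)) {
  ne <- nextElem(iter_count)
  aniprev <- sum(ne$ani1, ne$ani2)
  SRFD <- sum(ne$fd1, 2 * ne$fd2)
  totalSRFD <- sum(ne$fd0, ne$fd1, ne$fd2)
  manhattan_dist <- sum(abs(p1 - ne$ani1), abs(p2 - ne$ani2),
                        abs(r0 - ne$fd0), abs(r1 - ne$fd1), abs(r2 - ne$fd2))
  # Scalar conditions, so use && rather than the vectorized &
  if (manhattan_dist <= 5 && aniprev == SRFD && totalSRFD == 29) {
    ani <- cbind(ne$ani1, ne$ani2, ne$fd0,
                 ne$fd1, ne$fd2, manhattan_dist)
    TB <- rbind(TB, ani)  # keep only the combinations that pass the filter
  }
}
# Drop rows that contain the same values in a different order
nodup2 <- TB[!duplicated(t(apply(TB, 1, sort))), ]
end_time <- Sys.time()
end_time - start_time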
And here is the foreach and %dopar% parallel equivalent:
library(itertools)   # iproduct()
library(doParallel)  # also attaches foreach, iterators and parallel

start_time <- Sys.time()
p1 <- 2
p2 <- 2
r0 <- 25
r1 <- 4
r2 <- 0
iter_count <- iproduct(ani1 = (p1 - 2):(p1 + 2), ani2 = (p2 - 2):(p2 + 2),
                       fd0 = (r0 - 6):(r0 + 6), fd1 = (r1 - 4):(r1 + 6),
                       fd2 = r2:(r2 + 6))
dc <- detectCores() - 1  # leave one core free
registerDoParallel(dc)
# One task per combination; the results are row-bound together
res <- foreach(i = iter_count, .combine = rbind) %dopar% {
  aniprev <- sum(i$ani1, i$ani2)
  SRFD <- sum(i$fd1, 2 * i$fd2)
  totalSRFD <- sum(i$fd0, i$fd1, i$fd2)
  manhattan_dist <- sum(abs(p1 - i$ani1), abs(p2 - i$ani2),
                        abs(r0 - i$fd0), abs(r1 - i$fd1), abs(r2 - i$fd2))
  # Non-matching combinations return NULL, which rbind drops
  if (manhattan_dist <= 5 && aniprev == SRFD && totalSRFD == 29) {
    cbind(i$ani1, i$ani2, i$fd0,
          i$fd1, i$fd2, manhattan_dist)
  }
}
# Drop rows that contain the same values in a different order
nodup2 <- res[!duplicated(t(apply(res, 1, sort))), ]
end_time <- Sys.time()
end_time - start_time
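For comparison, the same filter can be sketched without iterators by building the whole grid with base R's expand.grid and testing all rows at once. This is only a sketch under the assumption that the full grid fits in memory, not a replacement for the timed code above. The names `grid`, `keep`, `hits` and `nodup` are illustrative, and the row order differs from iproduct's, so the de-duplication step may keep a different representative of each duplicate group.

```r
# Same centre values and +/- ranges as in the two timed versions
p1 <- 2; p2 <- 2; r0 <- 25; r1 <- 4; r2 <- 0

# One row per combination
grid <- expand.grid(ani1 = (p1 - 2):(p1 + 2), ani2 = (p2 - 2):(p2 + 2),
                    fd0 = (r0 - 6):(r0 + 6), fd1 = (r1 - 4):(r1 + 6),
                    fd2 = r2:(r2 + 6))

# Column-wise versions of the per-element sums in the loops
grid$manhattan_dist <- with(grid, abs(p1 - ani1) + abs(p2 - ani2) +
                                  abs(r0 - fd0) + abs(r1 - fd1) + abs(r2 - fd2))

# Here the vectorized & is the right operator: one TRUE/FALSE per row
keep <- with(grid, manhattan_dist <= 5 &
                   (ani1 + ani2) == (fd1 + 2 * fd2) &
                   (fd0 + fd1 + fd2) == 29)

hits <- as.matrix(grid[keep, ])

# Same de-duplication rule as in the loops: rows holding the same
# values in a different order count as duplicates
nodup <- hits[!duplicated(t(apply(hits, 1, sort))), , drop = FALSE]
```

Wrapping this in the same Sys.time() calls gives a baseline for how much of the runtime is per-iteration overhead rather than the arithmetic itself.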