I'm not a pro at foreach, but there are a few things to this that stand out:

- func2 references both int1 and int2 but is only given the latter; this might be an artifact of your simplified example, maybe not?
- your code here needs to be enclosed in a curly block, i.e., you need to change from

out <- foreach(i=1:length(int1list),.combine=rbind) %:%
out1 <- func1(i)
if(out1[[2]]==FALSE) ...

to

out <- foreach(i=1:length(int1list),.combine=rbind) %:% {
  out1 <- func1(i)
  if(out1[[2]]==FALSE) ...
}
- the docs for foreach suggest that the binary operator %:% is a nesting operator that is used between two foreach calls, but you aren't doing that; I got it to work correctly with %do% (or %dopar%) instead (see the minimal sketch after this list)
- I don't think prints work well inside parallel foreach loops ... it might work fine on the master node but not on all others, ref: How can I print when using %dopar%
- possibly again due to the simplified example, you define but don't actually use the contents of int1list (just its length); I'll remedy that in this example
- next works in "normal" R loops, not in these specialized foreach loops; it isn't a problem, though, since your if/else structure provides the same effect
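
For reference, here's a minimal sketch of how %:% is meant to be used: it sits directly between two foreach calls, with the body attached to the innermost one. (The variables a and b here are toy names of mine, not from the question.)

library(foreach)
# %:% chains an outer and an inner foreach; the body belongs to the inner loop
grid <- foreach(a = 1:2, .combine = rbind) %:%
  foreach(b = 1:3, .combine = rbind) %do% {
    data.frame(a = a, b = b, prod = a * b)
  }
grid   # 6 rows, one per (a, b) combination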
Here's your example, modified slightly to account for all of the above. I add a UsedJ column to indicate whether the inner j loop was used.
library(doParallel)
library(foreach)

func1 <- function(int1){
  results <- list(int1, int1 > 2)   # value plus a flag controlling the inner loop
  return(results)
}
func2 <- function(int1, int2){
  return(int1 / int2)
}

int1list <- seq(1, 3)
int2list <- seq(1, 5)

out <- foreach(i=1:length(int1list), .combine=rbind) %do% {
  out1 <- func1(int1list[i])
  if (!out1[[2]]) {
    data.frame("Scenario"=i, "Result"=out1[[1]], UsedJ=FALSE)
    # next   # not needed: the if/else already skips the inner loop
  } else {
    foreach(j=1:length(int2list), .combine=rbind) %dopar% {
      int3 <- func2(out1[[1]], int2list[j])
      data.frame("Scenario"=i, "Result"=int3, UsedJ=TRUE)
    }
  }
}
out
#   Scenario Result UsedJ
# 1        1   1.00 FALSE
# 2        2   2.00 FALSE
# 3        3   3.00  TRUE
# 4        3   1.50  TRUE
# 5        3   1.00  TRUE
# 6        3   0.75  TRUE
# 7        3   0.60  TRUE
Edit
If you aren't seeing parallelization, perhaps it's because you have not set up a "cluster" yet. There are also a few other changes to the workflow to get it to parallelize well, based on foreach's method of nesting loops with the %:% operator.

In order to "prove" this is working in parallel, I've added some logging based on How can I print when using %dopar% (because parallel processes do not print as one might hope).
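
A minimal local-cluster setup looks something like the sketch below. Three workers is an assumption on my part (it matches the batch sizes discussed later); size it to your machine.

library(doParallel)
cl <- makeCluster(3)     # three local worker processes (assumed; see parallel::detectCores())
registerDoParallel(cl)   # point %dopar% at the cluster
# ... run the foreach code ...
# stopCluster(cl)        # release the workers when finished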
library(doParallel)
library(foreach)

# write log messages to a socket so that output from workers is visible somewhere
Log <- function(text, ..., .port = 4000, .sock = make.socket(port = .port)) {
  msg <- sprintf(paste0(as.character(Sys.time()), ": ", text, "\n"), ...)
  write.socket(.sock, msg)
  close.socket(.sock)
}

func1 <- function(int1) {
  Log(paste("func1", int1))
  Sys.sleep(5)                      # simulate slow work
  results <- list(int1, int1 > 2)
  return(results)
}
func2 <- function(int1, int2) {
  Log(paste("func2", int1, int2))
  Sys.sleep(1)                      # simulate faster work
  return(int1 / int2)
}
The use of the logging code requires an external way to read from that socket. I'm using netcat (nc or Nmap's ncat) with ncat -k -l 4000 here. It is certainly not required for the job to work, but it is handy here to see how things are progressing. (Note: this listener/server needs to be running before you try to use Log.)
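
A quick way to confirm the plumbing (assuming the listener above is already running) is a one-off call from the master R session:

Log("smoke test from %s", "the master R session")   # should appear in the ncat window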
I couldn't get the nested "foreach -> func1 -> foreach -> func2" to parallelize func2 correctly. Based on the sleeps, this should take 5 seconds for the three calls to func1 and 2 seconds (a batch of three, then a batch of two) for the five calls to func2, but it takes 10 seconds (three parallel calls to func1, then five sequential calls to func2):
system.time(
  out <- foreach(i=1:length(int1list), .combine=rbind, .packages="foreach") %dopar% {
    out1 <- func1(int1list[i])
    if (!out1[[2]]) {
      data.frame(Scenario=i, Result=out1[[1]], UsedJ=FALSE)
    } else {
      foreach(j=1:length(int2list), .combine=rbind) %dopar% {
        int3 <- func2(out1[[1]], int2list[j])
        data.frame(Scenario=i, Result=int3, UsedJ=TRUE)
      }
    }
  }
)
#    user  system elapsed
#    0.02    0.00   10.09
with the respective console output:
2018-11-12 11:51:17: func1 2
2018-11-12 11:51:17: func1 1
2018-11-12 11:51:17: func1 3
2018-11-12 11:51:23: func2 3 1
2018-11-12 11:51:24: func2 3 2
2018-11-12 11:51:25: func2 3 3
2018-11-12 11:51:26: func2 3 4
2018-11-12 11:51:27: func2 3 5
(Note that the order is not guaranteed.)
So we can break it out into computing the func1 stuff first:
system.time(
  out1 <- foreach(i = seq_along(int1list)) %dopar% {
    func1(int1list[i])
  }
)
#    user  system elapsed
#    0.02    0.01    5.03
str(out1)
# List of 3
#  $ :List of 2
#   ..$ : int 1
#   ..$ : logi FALSE
#  $ :List of 2
#   ..$ : int 2
#   ..$ : logi FALSE
#  $ :List of 2
#   ..$ : int 3
#   ..$ : logi TRUE
console:
2018-11-12 11:53:21: func1 2
2018-11-12 11:53:21: func1 1
2018-11-12 11:53:21: func1 3
then work on the func2 stuff:
system.time(
  out2 <- foreach(i = seq_along(int1list), .combine="rbind") %:%
    foreach(j = seq_along(int2list), .combine="rbind") %dopar% {
      Log(paste("preparing", i, j))
      if (out1[[i]][[2]]) {
        int3 <- func2(out1[[i]][[1]], int2list[j])
        data.frame(i=i, j=j, Result=int3, UsedJ=TRUE)
      } else if (j == 1L) {
        data.frame(i=i, j=NA_integer_, Result=out1[[i]][[1]], UsedJ=FALSE)
      }
    }
)
#    user  system elapsed
#    0.03    0.00    2.05
out2
#   i  j Result UsedJ
# 1 1 NA   1.00 FALSE
# 2 2 NA   2.00 FALSE
# 3 3  1   3.00  TRUE
# 4 3  2   1.50  TRUE
# 5 3  3   1.00  TRUE
# 6 3  4   0.75  TRUE
# 7 3  5   0.60  TRUE
Two seconds (first batch of three is 1 second, second batch of two is 1 second) is what I expected. Console:
2018-11-12 11:54:01: preparing 1 2
2018-11-12 11:54:01: preparing 1 3
2018-11-12 11:54:01: preparing 1 1
2018-11-12 11:54:01: preparing 1 4
2018-11-12 11:54:01: preparing 1 5
2018-11-12 11:54:01: preparing 2 1
2018-11-12 11:54:01: preparing 2 2
2018-11-12 11:54:01: preparing 2 3
2018-11-12 11:54:01: preparing 2 4
2018-11-12 11:54:01: preparing 2 5
2018-11-12 11:54:01: preparing 3 1
2018-11-12 11:54:01: preparing 3 2
2018-11-12 11:54:01: func2 3 1
2018-11-12 11:54:01: preparing 3 3
2018-11-12 11:54:01: func2 3 2
2018-11-12 11:54:01: func2 3 3
2018-11-12 11:54:02: preparing 3 4
2018-11-12 11:54:02: preparing 3 5
2018-11-12 11:54:02: func2 3 4
2018-11-12 11:54:02: func2 3 5
You can see that func2 is called five times, correctly. Unfortunately, you can also see that there is a lot of "spinning" internally in the loop: iterations that do nothing. Granted, each is effectively a no-op (as evidenced by the 2.05-second runtime), so the load on the nodes is negligible.

If somebody has a method to preclude this needless spinning, I welcome comments or "competing" answers.
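
One idea in that direction (an untested sketch of mine, not a benchmarked answer): flatten the work into an explicit list of (i, j) tasks up front, so that every %dopar% iteration does real work and the placeholder rows are handled explicitly.

# sketch: enumerate only the tasks that need doing, then run one flat %dopar%
todo <- do.call(rbind, lapply(seq_along(int1list), function(i) {
  if (out1[[i]][[2]]) {
    data.frame(i = i, j = seq_along(int2list))   # all j values for this i
  } else {
    data.frame(i = i, j = NA_integer_)           # single placeholder row
  }
}))
out3 <- foreach(k = seq_len(nrow(todo)), .combine = rbind) %dopar% {
  i <- todo$i[k]; j <- todo$j[k]
  if (is.na(j)) {
    data.frame(i = i, j = NA_integer_, Result = out1[[i]][[1]], UsedJ = FALSE)
  } else {
    data.frame(i = i, j = j, Result = func2(out1[[i]][[1]], int2list[j]), UsedJ = TRUE)
  }
}

With this toy data that is 7 scheduled tasks instead of 15, at the cost of building the todo frame serially first.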