Let's say I want to use `foreach` with the doParallel package to return a list of two data frames of different dimensions, like the following:

a<-NULL
b<-NULL
for(i in 1:100){
  a<-rbind(a,data.frame(input=i,output=i/2))
  if(i > 5){
    b<-rbind(b,data.frame(input=i,output=i^2))
  }
}
list(a,b)

Since `foreach` returns a single object, there's no obvious way (at least to me) to do the above with `foreach`.

NOTE: this is a much simplified version of the problem I'm actually working with so solving the problem by using lapply (or something along those lines) won't work. The spirit of my question is how to do this with foreach.

NewNameStat

2 Answers

I figured it out. You have to define your own function that combines the lists in exactly the way you want.

# Takes an arbitrary number of lists x, all of which must have the same structure,
# and rbinds their matching elements
comb <- function(x, ...) {
      mapply(rbind,x,...,SIMPLIFY=FALSE)
}

# Assumes a and b were initialized to NULL beforehand, as in the question
foreach(i=1:10, .combine='comb') %dopar% {
      a<-rbind(a,data.frame(input=i,output=i/2))
      if(i > 5){
        b<-rbind(b,data.frame(input=i,output=i^2))
      }
      list(a,b)
}
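
A comment on this answer suggests dropping the `rbind` calls from the loop body and setting `.multicombine=TRUE`. A minimal sketch of that variant follows. The two-worker cluster is an arbitrary assumption for illustration, and returning `NULL` for `b` when `i <= 5` is one possible way to keep every result the same shape:

```r
library(doParallel)  # also attaches foreach and parallel

# Combine any number of same-structure lists by rbind-ing matching elements.
comb <- function(x, ...) {
  mapply(rbind, x, ..., SIMPLIFY = FALSE)
}

cl <- makeCluster(2)   # assumption: 2 workers, chosen arbitrarily
registerDoParallel(cl)

res <- foreach(i = 1:10, .combine = comb, .multicombine = TRUE) %dopar% {
  # Each iteration returns only its own rows; comb does all the binding.
  a <- data.frame(input = i, output = i / 2)
  b <- if (i > 5) data.frame(input = i, output = i^2) else NULL
  list(a, b)
}

stopCluster(cl)

dim(res[[1]])  # 10 rows, 2 columns
dim(res[[2]])  # 5 rows, 2 columns
```

Because `rbind` ignores `NULL` arguments when a data frame is also present, iterations that contribute nothing to `b` drop out of the second element.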
NewNameStat
  • I like your combine function, but I think you need to remove the rbind calls from the body of the foreach loop. Also, if you use the foreach `.multicombine=TRUE` option, it will be more efficient since `comb` will be called once rather than 9 times in your example. – Steve Weston Dec 05 '14 at 03:06
  • How can we alter the `comb` function to return only `unique` values? – Carrol May 08 '18 at 14:44
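
On the question of unique values: one possible approach, offered as an assumption rather than something the answer's author proposed, is to apply base R's `unique` to each combined element, which drops duplicate rows from a data frame. The name `comb_unique` and the sample inputs are hypothetical:

```r
# Hypothetical variant of comb: rbind matching elements, then drop duplicate rows.
comb_unique <- function(x, ...) {
  combined <- mapply(rbind, x, ..., SIMPLIFY = FALSE)
  lapply(combined, unique)
}

# Two results shaped like the loop body's output; their first elements are identical.
r1 <- list(data.frame(input = 1, output = 0.5), data.frame(input = 6, output = 36))
r2 <- list(data.frame(input = 1, output = 0.5), data.frame(input = 7, output = 49))

res <- comb_unique(r1, r2)
nrow(res[[1]])  # 1: the duplicated row is kept once
nrow(res[[2]])  # 2: both rows are distinct
```

It can be passed to `foreach` as `.combine = comb_unique` in the same way as `comb`.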

Adding a data.table rbindlist version to NewNameStat's answer:

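# Takes an arbitrary number of lists x, all of which must have the same structure,
# and combines their matching elements with data.table::rbindlist
comb <- function(x, ...) {
      mapply(function(...) data.table::rbindlist(list(...), fill = TRUE),x,...,SIMPLIFY=FALSE)
}

# Assumes a and b were initialized to NULL beforehand, as in the question.
# %dopar% must sit on the same line as the closing parenthesis of foreach(),
# and .packages loads data.table on each worker.
foreach(i=1:10, .combine=comb, .packages="data.table") %dopar% {
      a<-rbindlist(list(a,data.table(input=i,output=i/2)))
      if(i > 5){
        b<-rbindlist(list(b,data.table(input=i,output=i^2)))
      }
      list(a,b)
}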