I use R (3.1) with plyr, and doMC as the parallel backend (as far as I know, it is the only parallel backend that works with plyr).

My question is: how can I write to global variables from within the parallel workers? I have this (very artificial) example:

library(doMC)
library(plyr)
registerDoMC(cores=2)

result1 = data.frame(id=c(1:3), a=NA)
result2 = data.frame(id=c(1:3), b=NA)

f = function(x){
    result1[ result1$id==x$id, "a"] <<- x$a
    result2[ result2$id==x$id, "b"] <<- x$b
}

data = data.frame(id=c(1:3), a=c(4:6), b=c(7:9))
a_ply(data, .margins=1, .fun=f, .parallel=TRUE)

Since I want to fill two data.frames, I can't use aaply or adply. The example does what it is supposed to do with .parallel=FALSE. When I run it in parallel, the result data.frames remain empty. I know that I have to export the global variables to the workers, and I tried .paropts=list(.export=c("result1", "result2")), but that doesn't help.

Does anybody know how to export global variables to doMC workers? Or is there another solution that fills both data.frames in a parallel environment (maybe without plyr)?

Jonas
  • OT: `library(doParallel);registerDoParallel(2)` will work with plyr too. And every other parallel backend (doMPI, doSNOW). – Marek Sep 09 '14 at 10:18
  • No, `doMC` exports all the parameters plyr needs. See @hadley's comment on this question: http://stackoverflow.com/questions/5559287/how-do-i-make-dosmp-play-nicely-with-plyr – Jonas Sep 09 '14 at 10:35
  • It was three years ago ;) – Marek Sep 09 '14 at 11:17
  • Hmm, however, doParallel does not work on my 64-bit Ubuntu system. – Jonas Sep 09 '14 at 11:29

1 Answer

You can't modify global objects from parallel workers. Each worker operates on its own copy of the data (that is why you need to export your data.frames), so any change is made only to that copy and never reaches the global environment of the master process.
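A minimal sketch of this behavior, assuming a Unix-like system where forking is available. It uses `mclapply` from the base `parallel` package rather than plyr, and the `counter` variable is hypothetical, introduced only for illustration:

```r
library(parallel)

# Hypothetical global variable, used only for this illustration.
counter <- 0

# Each forked worker updates its own copy of `counter` with <<-.
# (With mc.cores = 1, mclapply falls back to lapply and no fork happens.)
invisible(mclapply(1:2, function(i) {
    counter <<- counter + i
    counter
}, mc.cores = 2))

# The master process never sees the workers' assignments.
print(counter)
```

The assignment succeeds inside each worker, but only on that worker's private copy, which is discarded when the worker exits.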

You need to rewrite your function so that it returns a value, which you can then use in the master process to fill the data.frames.
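One way this could look is sketched below. It is an assumption-laden example rather than the only approach: it swaps plyr for `mclapply` from the base `parallel` package (Unix-like systems only), and the row-index argument `i` and the names `pieces` and `combined` are hypothetical. The same pattern, returning a value per row and combining afterwards, should carry over to plyr's `llply(..., .parallel=TRUE)` with a registered backend.

```r
library(parallel)

data <- data.frame(id = 1:3, a = 4:6, b = 7:9)

# Return the values for one row instead of assigning
# into global data.frames with <<-.
f <- function(i) {
    x <- data[i, ]
    data.frame(id = x$id, a = x$a, b = x$b)
}

# Run f in forked workers; the results come back in input order.
pieces <- mclapply(seq_len(nrow(data)), f, mc.cores = 2)

# Combine the returned pieces in the master process.
combined <- do.call(rbind, pieces)

# Fill both result data.frames from the combined output.
result1 <- combined[, c("id", "a")]
result2 <- combined[, c("id", "b")]
```

Because every worker only returns data, nothing depends on shared state, and both data.frames are built after the parallel step has finished.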

Marek