I'm trying to write an R package that contains several nested functions, called from within a foreach statement with a doMPI backend. It throws a "cannot find object 'XXX'" error. The strange thing is that this error does not occur if I use doParallel as the backend. This is a toy example of the problem, but I need a working solution with doMPI for much bigger problems.

This is the code that has been built into the R package using RStudio, roxygen, devtools etc.:

#' Test function level 1
#' @param var11 first variable for function 1
#' @param var12 second variable for function 1
#' @param var13 third variable for function 1
#' @export fun1

fun1 <- function (fun2.params, fun3.params, var11, var12, var13, ...) {

    results <- data.frame (foreach::`%dopar%`(
               foreach::`%:%`(foreach::foreach(j = 1:var11, .combine = cbind),
               foreach::foreach (i = 1:var12, .combine=rbind)),
               {
                   out3 <- replicate(var13,
                                     do.call(fun2,
                                             c(list(fun3.params=fun3.params),
                                               fun2.params)))
                   output2 <- data.frame(mean(out3))
               }
    ))
    ## save outputs for subsequent analyses if required
    saveRDS(results, file = paste("./outputs/", var13, "_", var12, "_", var11, "_",
                                  format(Sys.time(), "%d_%m_%Y"), ".rds", sep = ""))
}

#' Test function level 2
#' @param var21 first variable for function 2
#' @param var22 second variable for function 2
#' @export fun2

fun2 <- function (fun3.params, var21, var22, ...) {
    out2 <- `if` (rpois(1, var21) > 0, var22 * do.call(fun3, fun3.params), 0)
}

#' Test function level 3
#' @param var31 first variable for function 3
#' @param var32 second variable for function 3
#' @param var33 third variable for function 3
#' @export fun3

fun3 <- function (var31, var32, var33, ...) {
    out3 <- var31 * rnorm(1, mean=var32, sd= var33)
}

I then load the library and call the top-level function from an .R file, using Emacs ESS (or the RStudio editor), with these commands:

library(toymod)
library(doParallel)
cl <-makeCluster(10)
registerDoParallel(cl)

fun1.params <- list(var11=10, var12=150, var13=365)
fun2.params <- list(var21=0.05,var22=9.876)
fun3.params <- list(var31=1.396,var32=14.387,var33=3.219)

do.call(fun1, c(list(fun2.params = fun2.params,
                     fun3.params = fun3.params),
                fun1.params))

When I run it with doParallel as the parallel backend it works fine. However, when I run it with doMPI, I get the following error:

Error in { : task 12 failed - "object 'fun2' not found"

This is running on Ubuntu 16.04 Linux, using R 3.4.1, doMPI 0.2.2, and doParallel. I've put the whole package on github at https://github.com/jamaas/toymod.git

Could someone tell me if I need to change the code for doMPI? It seems to be related to producing the R package.

Jim Maas
  • It's a guess but this might be because `doParallel` will clone your parent R environment into *X* workers, but `doMPI` probably doesn't do this. You will need to export your functions from your parent environment to each worker, `cl`, or embed them within the `{}` of `foreach`. – CPak Sep 05 '17 at 22:28
  • I agree that this is odd, since doMPI should behave the same as doParallel with the snow-derived interface (which is what you're doing). I suspect an enhancement/bug fix was made to doParallel which I never made to doMPI. I'm starting to look into this, but it may take some time. – Steve Weston Sep 06 '17 at 13:31

1 Answer

I believe the problem is that you need to use the foreach option .packages='toymod'. The body of the foreach loop isn't actually part of the 'toymod' package, so the workers need to load 'toymod' just as they would to access functions from any other R package.

I don't know why this isn't necessary when using doParallel. I guess doParallel must automatically load the package that the foreach loop is in. I'll look into this some more, and perhaps modify doMPI to do the same.
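As a rough sketch of what that change could look like, here is fun1 from the question with .packages added to the outer foreach() call. It assumes the package is installed under the name 'toymod' on every MPI worker. As far as I know, %:% merges the package lists of both foreach objects, so setting the option on one of them should be enough; if in doubt, it can be set on both.

```r
# Sketch only: fun1 from the question, with the single change being the
# .packages argument. Assumes 'toymod' is installed on every worker.
fun1 <- function(fun2.params, fun3.params, var11, var12, var13, ...) {
    results <- data.frame(foreach::`%dopar%`(
        foreach::`%:%`(
            # .packages asks each worker to load toymod before running the body,
            # so that the exported fun2 (and, through it, fun3) can be found
            foreach::foreach(j = 1:var11, .combine = cbind,
                             .packages = "toymod"),
            foreach::foreach(i = 1:var12, .combine = rbind)),
        {
            out3 <- replicate(var13,
                              do.call(fun2,
                                      c(list(fun3.params = fun3.params),
                                        fun2.params)))
            output2 <- data.frame(mean(out3))
        }
    ))
    ## save outputs for subsequent analyses if required
    saveRDS(results, file = paste("./outputs/", var13, "_", var12, "_", var11, "_",
                                  format(Sys.time(), "%d_%m_%Y"), ".rds", sep = ""))
}
```

The calling script should not need to change apart from registering doMPI instead of doParallel, for example with startMPIcluster() and registerDoMPI(cl).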

Steve Weston
  • That works! Thx. It does seem a bit strange to me; I don't know the correct technical term, but 'circular' (Ouroboros) or 'incestuous' spring to mind. It's not logical to me to have to load a function within itself, but perhaps this is normal in some programming paradigms? – Jim Maas Sep 07 '17 at 09:27
  • @JimMaas Setting up the environment for an R expression to be evaluated in remote processes is tricky. You'd really like the body of the foreach loop to be able to access non-exported functions from the package in which it is defined, but I don't think that is possible in these circumstances. I think it does make sense to have the package automatically loaded, but that still treats the foreach body as a second class citizen of the package. – Steve Weston Sep 07 '17 at 14:31