This is a follow-up to "Parallelize function taking external pointers (XPtr)".

I won't reproduce the C++ code here, to keep things short. The problem was that once a parameter of a function is evaluated, it is defined in the function's environment and, in the case of an external pointer, is no longer available in a fork cluster.
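For readers less familiar with R's lazy evaluation, here is a minimal base-R sketch of what "evaluating a parameter" means. It involves neither external pointers nor a cluster; `make_value`, `lazy` and the `evaluated` flag are hypothetical names used only for illustration.

```r
# Flag flipped as a side effect, so we can see when the argument is evaluated.
evaluated <- FALSE
make_value <- function() {
  evaluated <<- TRUE
  42
}

lazy <- function(a) {
  before <- evaluated  # a is still an unevaluated promise at this point
  force(a)             # first use of a: the promise is evaluated now
  after <- evaluated
  c(before = before, after = after)
}

state <- lazy(make_value())
```

Here `state` holds `before = FALSE` and `after = TRUE`: the argument expression runs only when `a` is first used inside the function.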

So while this function worked:

library(parallel)
test1 <- function(a) {
  cl <- makeForkCluster(nnodes=2)
  r <- parLapply(cl, 1:5, function(i) g(a,i) )
  stopCluster(cl)
  unlist(r)
}

This function didn't:

test2 <- function(a) {
  cl <- makeForkCluster(nnodes=2)
  p <- g(a, 0)
  r <- parLapply(cl, 1:5, function(i) g(a,i) )
  stopCluster(cl)
  unlist(r)
}

As pointed out by Ralf Stubner, this came from the fact that the call `g(a, 0)` forced the evaluation of the promise `a`. He suggested the following workaround (here with two debug prints to understand how it works):

test3 <- function(a) {
  cl <- makeForkCluster(nnodes = 2)
    print(pryr::promise_info(a))
  b <- eval(substitute(a))
  p <- g(b, 0)
    print(pryr::promise_info(a))
  r <- parLapply(cl, 1:5, function(i) g(a,i) )
  stopCluster(cl)
  unlist(r)
}

This gives access to whatever is in `a`, while `a` itself remains an unevaluated promise. But it doesn't work when `test3` is called from another function!

test4 <- function(b) test1(b)
test5 <- function(b) test3(b)

While `test4` works well (the recursive promise evaluation seems to give a valid pointer), the workaround in `test3` no longer works when called from `test5`.

The debugging prints show that, despite the `eval(substitute(a))` trick, the promise is evaluated. My understanding is that this trick forces the evaluation of the promise `b` of `test5` in its environment, so `a` becomes evaluated too.
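The chain of promises that appears when one function passes its argument on to another can be observed in plain base R. The sketch below is only an illustration under assumptions: it uses no external pointers and no cluster, and `make_value`, `callee`, `wrapper` and `evaluations` are hypothetical names. It suggests that the expression behind the inner argument is just the symbol `b`, and that forcing it evaluates the wrapper's own promise.

```r
# Counter incremented as a side effect each time the argument expression runs.
evaluations <- 0
make_value <- function() {
  evaluations <<- evaluations + 1
  "payload"
}

# Plays the role of the inner function (like test3).
callee <- function(a) {
  expr <- deparse(substitute(a))  # expression behind the promise; does not force it
  before <- evaluations
  force(a)                        # forces a, which in turn forces the caller's b
  list(expr = expr, triggered = evaluations - before)
}

# Plays the role of the outer function (like test5).
wrapper <- function(b) {
  res <- callee(b)  # b is passed down unevaluated
  force(b)          # b was already forced through callee: no second evaluation
  res$total <- evaluations
  res
}

nested <- wrapper(make_value())
```

In this sketch `nested$expr` is `"b"`, and both `nested$triggered` and `nested$total` equal 1: the only evaluation of `make_value()` happens when the inner function forces its argument.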

Is there another workaround? (I tried to play with `pryr::parent_promise`, but even the example code from its man page gives strange results.)

I have other, more complex problems of this type. A general way to get the content of a promise without evaluating it, or to pass external pointers through other functions with a `parLapply` call at the very end without constantly stumbling on this problem, would be very welcome.

Elvis
  • I have no idea how to solve this. Maybe it is best to rethink your approach to parallelization. We don't know enough about your real use-case to offer any advice on this, though. – Ralf Stubner Feb 19 '19 at 17:49
  • @ralf-stubner I currently have satisfying workarounds. I was just trying to improve them / understand things better. Thank you again for your previous answer and for having a look at this question. – Elvis Feb 19 '19 at 19:41
  • I am with @Ralf here and mostly shake my head: _most if not all_ solutions to parallel programming _setup_ issues deal with making _very_ certain the code to run is on the node. Putzing around with promises does not strike me as helping here but your mileage may, as they say, vary. – Dirk Eddelbuettel Feb 19 '19 at 22:45
  • @DirkEddelbuettel I usually agree that it is a terrible idea to bypass "rules" that are here for code security... However, in this case I have a hard time understanding why these rules are here -- and whether they are indeed rules or merely undesirable side effects. As long as the xptr has been created before the fork cluster, it should be valid on all nodes. It *is* valid on all nodes, as proved by the fact that `test1` and `test4` work perfectly well. – Elvis Feb 23 '19 at 04:36
  • @DirkEddelbuettel I don't get why it is a good thing that `test2` doesn't work; the fact that the workaround in `test3` works well makes me think that it might not be a good thing at all. – Elvis Feb 23 '19 at 04:38
  • @DirkEddelbuettel Of course my understanding is limited. I surely should have asked a question about the mechanisms of external pointer validity checks instead -- why and when do they become invalid? – Elvis Feb 23 '19 at 04:43
  • We have been using parallel computing approaches with R since at least the early 2000s when packages like [snow](https://cran.r-project.org/package=snow) started to make it easy(-ish). You will find 15 years of discussion of these scoping rules in the various list archives, and numerous tutorials and discussions in many places. I invite you to study those carefully, or just trust current best-of-breed approaches as _e.g._ in the [future](https://cran.r-project.org/package=future) package. – Dirk Eddelbuettel Feb 23 '19 at 13:33
  • Thanks a lot Dirk, I'll try to take the time to have a look! – Elvis Feb 23 '19 at 13:39
  • This is an interesting question from the perspective of how R works. But I suspect you would have to dig very deep to determine the exact reasons. Also note that `mclapply` works. – thc Jan 15 '21 at 19:34

0 Answers