
Taking an example XPtr function:

test.cpp

#include <Rcpp.h>

// [[Rcpp::export]]
SEXP funx()
{
    /* creating a pointer to a vector<int> */
    std::vector<int>* v = new std::vector<int> ;
    v->push_back( 1 ) ;
    v->push_back( 2 ) ;

    /* wrap the pointer as an external pointer */
    /* this automatically protects the external pointer from R garbage
     collection until p goes out of scope. */
    Rcpp::XPtr< std::vector<int> > p(v, true) ;

    /* return it to R; once p goes out of scope after the return,
     the external pointer is no longer protected by p, but it stays
     protected because it is referenced on the R side */
    return( p ) ;
}

R

library(Rcpp)
sourceCpp("test.cpp")

xp <- funx()
xp
<pointer: 0x9618cc0>

But if I try to parallelize this, I get null pointers:

library(parallel)
out <- mclapply(1:2, function(x) funx())
out
[[1]]
<pointer: (nil)>

[[2]]
<pointer: (nil)>

Is it possible to achieve this kind of functionality?

Edit

It is worth noting that, despite a duplicate question, there appears to be no true solution to this problem. From what I understand now, an XPtr cannot be used in a multi-threaded context, so essentially this cannot be done in R.

For example, when I put the function inside a package named test and use snow, it still fails to return valid pointers.

library(test)
library(snow)

fun <- function(){
  library(test)
  test:::funx()
}

cl <- makeCluster(2, type = "SOCK") 
clusterExport(cl, 'fun') 
clusterCall(cl, fun)
[[1]]
<pointer: (nil)>

[[2]]
<pointer: (nil)>
cdeterman
  • There is a new package called `RcppParallel`, which may be helpful here. – lmo May 11 '16 at 15:50
  • Thanks to @nrussell for pointing to the dupe. But I still would not recommend sending a `SEXP` into multiple threads, as a `gc()` may occur, resulting in tears. – Dirk Eddelbuettel May 11 '16 at 17:57
  • @DirkEddelbuettel so looking at the 'dupe' the recommendation is to basically BATCH the calls? An example is not provided in the answer if this is possible in a single R session. – cdeterman May 11 '16 at 18:15
  • "Write a package". Which is basically the answer to every question about deployment ... – Dirk Eddelbuettel May 11 '16 at 18:19
  • @DirkEddelbuettel I understand that but I am still wondering about the parallelization. I put the `XPtr` function in a package, check. Now how can it be multithreaded? Sorry if I am being obtuse here, I just don't understand. – cdeterman May 11 '16 at 18:20
  • You can NOT have ANY R data structure in a multi-threaded context because R is single-threaded. `XPtr` is (a lightweight proxy of) an R data structure. – Dirk Eddelbuettel May 11 '16 at 18:25
  • @DirkEddelbuettel okay, and I assume that applies even if the `XPtr` is referencing a C++ data structure like `arma::mat` or `Eigen::MatrixXd` because `XPtr` is a proxy of an R structure. Correct? – cdeterman May 11 '16 at 18:30

1 Answer

Regarding

Is it possible to achieve this kind of functionality?

I would say the answer is a pretty firm 'nope' as the First Rule of Fight Club applies here: you simply cannot parallelise the underlying R instance merely by hoping it would work. Packages like RcppParallel are very careful about using non-R data structures for multithreaded work.

I may be too pessimistic, but I would place the 'collection level' one level deeper and return only its aggregated result to R.
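To illustrate what placing the collection one level deeper could look like, here is a minimal sketch in plain C++. It assumes the per-task work can be written without touching any R or Rcpp object; the name `build_in_parallel` is hypothetical, and the worker body simply mirrors the two `push_back` calls from the question. The threads only ever see standard C++ data, and everything is gathered on the calling thread before anything would be handed to R.

```cpp
#include <cstddef>
#include <memory>
#include <thread>
#include <vector>

// Hypothetical helper (not part of Rcpp or RcppParallel): builds one
// independent object per task on worker threads, using only plain C++ data.
// No SEXP or Rcpp type is created or touched inside the threads.
std::vector<std::unique_ptr<std::vector<int>>> build_in_parallel(std::size_t n_tasks)
{
    std::vector<std::unique_ptr<std::vector<int>>> results(n_tasks);
    std::vector<std::thread> workers;
    workers.reserve(n_tasks);

    for (std::size_t i = 0; i < n_tasks; ++i) {
        // Each thread writes only to its own pre-allocated slot,
        // so no locking is needed.
        workers.emplace_back([&results, i]() {
            std::unique_ptr<std::vector<int>> v(new std::vector<int>);
            v->push_back(1);
            v->push_back(2);
            results[i] = std::move(v);
        });
    }

    // Wait for every worker before the results are used.
    for (auto& w : workers) {
        w.join();
    }
    return results;
}
```

Inside an exported Rcpp function, the calling (main R) thread could then, after the join, hand each element over with something like `Rcpp::XPtr< std::vector<int> >(ptr.release(), true)` and collect those in an `Rcpp::List`. That wrapping step is left out of the sketch on purpose, since it is the only part that touches R. The resulting pointers would still be valid only inside the R process that created them, so this does not help with `mclapply` or snow workers.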

Dirk Eddelbuettel
  • So would it be possible to use `RcppParallel` to return a `list` of `XPtr` objects? The objects I am trying to create can't be aggregated so I am trying to find out if there is any possible way to parallelize and return pointers to the objects created in each 'child' process. – cdeterman May 11 '16 at 17:01