0

I am trying to parallelize Rcpp code. From this post, I was able to get my MRE to run and produce the expected output by just sourcing the functions:

> Rcpp::sourceCpp("src/rnorm_c.cpp")
> source("~/<path to project folder>/rnormpar/R/normal_mat.R")
> norm_mat_par()
[[1]]
           [,1]
[1,] -0.1117342

[[2]]
           [,1]
[1,] 0.05094005

[[3]]
          [,1]
[1,] 0.1137641

[[4]]
          [,1]
[1,] 0.8624004

[[5]]
          [,1]
[1,] 0.7821107

However, after building and running the function from within the package, the output changed to:

Restarting R session...

> library(rnormpar)
> rnormpar::norm_mat_par()
[[1]]
<simpleError in .Call("_rnormpar_rnorm_n", PACKAGE = "rnormpar",     n, mu, sd): "_rnormpar_rnorm_n" not available for .Call() for package "rnormpar">

[[2]]
<simpleError in .Call("_rnormpar_rnorm_n", PACKAGE = "rnormpar",     n, mu, sd): "_rnormpar_rnorm_n" not available for .Call() for package "rnormpar">

[[3]]
<simpleError in .Call("_rnormpar_rnorm_n", PACKAGE = "rnormpar",     n, mu, sd): "_rnormpar_rnorm_n" not available for .Call() for package "rnormpar">

[[4]]
<simpleError in .Call("_rnormpar_rnorm_n", PACKAGE = "rnormpar",     n, mu, sd): "_rnormpar_rnorm_n" not available for .Call() for package "rnormpar">

[[5]]
<simpleError in .Call("_rnormpar_rnorm_n", PACKAGE = "rnormpar",     n, mu, sd): "_rnormpar_rnorm_n" not available for .Call() for package "rnormpar">

Here is the code for my MRE. It consists of two scripts. The first is the Rcpp code:

#include <RcppArmadillo.h>
//[[Rcpp::depends(RcppArmadillo)]]
using namespace Rcpp;

// function to generate a single sample from the standard normal distribution
//[[Rcpp::export]]
double rnorm1() {
  return (double)arma::vec(1, arma::fill::randn)(0, 0);
}

// function to return a vector of n samples from the normal distribution
//[[Rcpp::export]]
arma::vec rnorm_n(int n = 1, double mu = 0, double sd = 1){

  arma::vec res(n);

  for (int j = 0; j < n; j++){
    res(j) = rnorm1();
  }

  res = res * sd + mu;

  return res;
}

The second is the R code:

# generates a matrix distributed independent normal
# takes n, p, mean vector, and sd vector representing the diagonal of the
# covariance matrix
#' Normal matrix
#'
#' @param n sample size
#' @param p number of variables
#' @param mu mean vector
#' @param sd diagonal of the covariance matrix
#'
#' @return normal matrix
#' @export
#'
#' @examples norm_mat(1e2, 3, -1:1, 1:3)
norm_mat <- function(n = 1, p = 1, mu = rep(0, p), sd = rep(1, p)){

  res <- matrix(NA, n, p)

  for(j in 1:p){
    res[ , j] <- rnorm_n(n, mu[j], sd[j])
  }

  return(res)

}

#' Title
#'
#' @return
#' @export
#'
#' @examples
norm_mat_par <- function(){

  nworkers <- parallel::detectCores() - 1

  cl <- parallel::makeCluster(nworkers)

  doParallel::registerDoParallel(cl)

  x <- foreach::`%dopar%`(
    foreach::foreach(j = 1:5, .errorhandling='pass', .export = "norm_mat",
                     .noexport = c("rnorm_n", "rnorm1"), .packages = c("Rcpp")),
    {
      sourceCpp("src/rnorm_c.cpp")
      norm_mat()
    })

  parallel::stopCluster(cl)

  return(x)
}

This is the github repo for my MRE

Thanks in advance to everyone taking the time to respond!

Jacob Helwig
  • 41
  • 1
  • 5
  • 1
    Why are you using `sourceCpp` within a package? That's not how you do it. Note the `RcppExports.R` file? Just call `rnorm1` or `rnorm_n` instead. – Roland Jan 04 '22 at 06:42
  • 1
    Also, your package is missing a dependency on `doParallel` and `foreach`. – Roland Jan 04 '22 at 06:44
  • @Roland I added the dependencies and pushed. Sourcing within the package is an idea from the response I linked. Removing it, I get the same error from within the package, however, sourcing the functions now produces the error `` – Jacob Helwig Jan 04 '22 at 16:36
  • @JacobHelwig There is nothing special here. We answered dozens of times over the years how to best launch code in parallel for R, with and without R. The answer almost always is "create a package" so I recommend you do. You can of course repeat the same questions all these other folks have already asked but folks may be too bored to repeat the answers one more time. It is all here (and in other places) so I really recommend you continue reading and looking. The one reference you picked may not be the best one. – Dirk Eddelbuettel Jan 04 '22 at 17:26
  • Sorry: "with and without Rcpp" in previous comment. – Dirk Eddelbuettel Jan 04 '22 at 17:56

1 Answers1

2

The GitHub repo rcpp-and-doparallel provided the solution.

I will demonstrate here how I modified my package - the corresponding commit in the rnormpar repo has commit message "Solved parallelization".

First, I modified the R script titled rnorm_package.R that I created for registering my cpp functions to mirror that of the rcpp-and-doparallel package:

#' @keywords internal
"_PACKAGE"

# The following block is used by usethis to automatically manage
# roxygen namespace tags. Modify with care!
## usethis namespace: start
#' @useDynLib rnormpar, .registration = TRUE
#' @importFrom Rcpp sourceCpp
## usethis namespace: end
NULL

I then deleted and re-generated my NAMESPACE using devtools::document(). This caused the following lines to be added to NAMESPACE:

importFrom(Rcpp,sourceCpp)
useDynLib(rnormpar, .registration = TRUE)

If these lines are already in the NAMESPACE, then the first two steps are perhaps not necessary.

Finally, I modified the arguments to the foreach function so that my package was passed to the workers:

norm_mat_par <- function(){

  nworkers <- parallel::detectCores() - 1

  cl <- parallel::makeCluster(nworkers)

  doParallel::registerDoParallel(cl)

  x <- foreach::`%dopar%`(
    foreach::foreach(j = 1:5, .packages = "rnormpar"),
    {
      norm_mat()
    })

  parallel::stopCluster(cl)

  return(x)
}

After building the package, the function produces the expected output:

Restarting R session...

> library(rnormpar)
> rnormpar::norm_mat_par()
[[1]]
          [,1]
[1,] -1.948502

[[2]]
           [,1]
[1,] -0.2774582

[[3]]
          [,1]
[1,] 0.1710537

[[4]]
         [,1]
[1,] 1.784761

[[5]]
           [,1]
[1,] -0.5694733
Jacob Helwig
  • 41
  • 1
  • 5