4

In an Rcpp project I would like to be able to either call an R function (the cobs function from the cobs package to do a concave spline fit) or call the fortran code that it relies on (the cobs function uses quantreg's rq.fit.sfnc function to fit the constrained spline model, which in turn relies on the fortran coded srqfnc function in the quantreg package) within a pragma openmp parallel for loop (the rest of my code mainly requires some simple linear algebra, so that would be no problem, but sadly each inner loop iteration also requires me to do a concave spline fit). I was wondering if this would be allowed or possible though, as I presume such calls would not be thread safe? Would there be an easy fix for this, like to surround those calls by a #pragma omp critical ? Would anyone have any examples of this? Or would the only way in this case involve doing a full Rcpp port of the cobs & rq.fit.sfnc functions first using thread-safe Armadillo classes?

Tom Wenseleers
  • 7,535
  • 7
  • 63
  • 103

2 Answers2

4

Citing the manual:

Calling any of the R API from threaded code is ‘for experts only’ and strongly discouraged. Many functions in the R API modify internal R data structures and might corrupt these data structures if called simultaneously from multiple threads. Most R API functions can signal errors, which must only happen on the R main thread. Also, external libraries (e.g. LAPACK) may not be thread-safe.

I have always interpreted this as "one must not call an R API function from threaded code". Calling an R function, irregardless of what's used inside, from inside a omp parallel region would be just that. Using #pragma omp critical might work, but if it breaks you got to keep the pieces ...

It would be safer to either re-implement the code in question or look for an existing implementation in C++/C/Fortran and call that directly.

Ralf Stubner
  • 26,263
  • 3
  • 40
  • 75
  • 1
    Thanks for the advice - so I just tried and calling R code within a #pragma openmp parallel for loop indeed only works if one adds a #pragma omp critical... I've posted an example below... Might still have its uses though... But I'll probably have to port cobs and quantreg to Rcpp for optimal performance... – Tom Wenseleers Mar 19 '19 at 21:37
  • If you see anything weird in my Rcpp code let me know btw! – Tom Wenseleers Mar 19 '19 at 21:40
3

So I just tried, and it seems that calling R functions within a #pragma openmp parallel for loop only works if it is preceded by #pragma omp critical (otherwise it causes a stack imbalance, and crashes R). Of course this will then cause that part of the code to execute sequentially, but this might still be useful in some cases.

Example:

The Rcpp part, saved as file "fitMbycol.cpp":

// [[Rcpp::plugins(cpp11)]]
// [[Rcpp::depends(RcppArmadillo)]]
#include <RcppArmadillo.h>
// #define RCPP_ARMADILLO_RETURN_COLVEC_AS_VECTOR
using namespace Rcpp;
using namespace arma;
using namespace std;

#include <omp.h>
// [[Rcpp::plugins(openmp)]]

// [[Rcpp::export]]
arma::mat fitMbycol(arma::mat& M, Rcpp::Function f, const int nthreads) {

  // ARGUMENTS
  // M: matrix for which we want to fit given function f over each column
  // f: fitting function to use with one single argument (vector y) that returns the fitted values as a vector
  // nthreads: number of threads to use

  // we apply fitting function over columns
  int c = M.n_cols;
  int r = M.n_rows;
  arma::mat out(r,c);
  int i;
  omp_set_num_threads(nthreads);
#pragma omp parallel for shared(out)
  for (i = 0; i < c; i++) {
      arma::vec y = M.col(i); // ith column of M
#pragma omp critical
{
      out.col(i) = as<arma::colvec>(f(NumericVector(y.begin(),y.end())));
}
  }

  return out;

}

And then in R:

First the pure R version:

(we simulate some Gaussian peak shapes with Poisson noise and then do a log-concave spline fit on them using the cobs function)

x=1:100
n=length(x)
ncols=50
gauspeak=function(x, u, w, h=1) h*exp(((x-u)^2)/(-2*(w^2)))
Y_nonoise=do.call(cbind,lapply(seq(min(x), max(x), length.out=ncols), function (u) gauspeak(x, u=u, w=10, h=u*100)))
set.seed(123)
Y=apply(Y_nonoise, 2, function (col) rpois(n,col))

# log-concave spline fit on each column of matrix Y using cobs
require(cobs)
logconcobs = function(y, tau=0.5, nknots=length(y)/10) {
  x = 1:length(y)
  offs = max(y)*1E-6
  weights = y^(1/2)
  fit.y = suppressWarnings(cobs(x=x,y=log10(y+offs), 
              constraint = "concave", lambda=0, 
              knots = seq(min(x),max(x), length.out = nknots), 
              nknots=nknots, knots.add = FALSE, repeat.delete.add = FALSE,
              keep.data = FALSE, keep.x.ps = TRUE,
              w=weights, 
              tau=tau, print.warn = F, print.mesg = F, rq.tol = 0.1, maxiter = 100)$fitted)
  return(pmax(10^fit.y - offs, 0))
}
library(microbenchmark)
microbenchmark(Y.fitted <- apply(Y, 2, function(col) logconcobs(y=col, tau=0.5)),times=5L) # 363 ms, ie 363/50=7 ms per fit
matplot(Y,type="l",lty=1)
matplot(Y_nonoise,type="l",add=TRUE, lwd=3, col=adjustcolor("blue",alpha.f=0.2),lty=1)
matplot(Y.fitted,type="l",add=TRUE, lwd=3, col=adjustcolor("red",alpha.f=0.2),lty=1)

enter image description here

And now using Rcpp calling our R fitting function logconcobs within a #pragma openmp parallel for loop, enclosed with a #pragma omp critical:

library(Rcpp)
library(RcppArmadillo)
Rcpp::sourceCpp('fitMbycol.cpp')
microbenchmark(Y.fitted <- fitMbycol(Y, function (y) logconcobs(y, tau=0.5, nknots=10), nthreads=8L ), times=5L) # 361 ms

OpenMP of course ends up not having any effect in this case, as the #pragma omp critical causes everything to execute sequentially, but in more complex examples this could still be useful.

Tom Wenseleers
  • 7,535
  • 7
  • 63
  • 103