I commonly work with a short Rcpp function that takes as input a matrix where each row contains K probabilities that sum to 1. The function then randomly samples for each row an integer between 1 and K corresponding to the provided probabilities. This is the function:

// [[Rcpp::depends(RcppArmadillo)]]
#include <RcppArmadilloExtensions/sample.h>

using namespace Rcpp;

// [[Rcpp::export]]
IntegerVector sample_matrix(NumericMatrix x, IntegerVector choice_set) {
  int n = x.nrow();
  IntegerVector result(n);
  for ( int i = 0; i < n; ++i ) {
    result[i] = RcppArmadillo::sample(choice_set, 1, false, x(i, _))[0];
  }
  return result;
}

I recently updated R and all packages. Now I cannot compile this function anymore. The reason is not clear to me. Running

library(Rcpp)
library(RcppArmadillo)
Rcpp::sourceCpp("sample_matrix.cpp")

throws the following error:

error: call of overloaded 'sample(Rcpp::IntegerVector&, int, bool, Rcpp::Matrix<14>::Row)' is ambiguous

This basically tells me that my call to RcppArmadillo::sample() is ambiguous. Can anyone enlighten me as to why this is the case?

yrx1702

1 Answer

There are two things happening here, and two parts to your problem and hence the answer.

The first is "meta": why now? Well, we had a bug left in the sample() code / setup, which Christian kindly fixed for the most recent RcppArmadillo release (and it is all documented there). In short, the interface for the very probability argument giving you trouble here was changed, as it was not safe for re-use / repeated use. It is now.

Second, the error message. You didn't say what compiler or version you use, but mine (currently g++-9.3) is actually pretty helpful with the error. It is still C++, so some interpretative dance is needed, but in essence it is clearly stating that you called with Rcpp::Matrix<14>::Row and that no interface is provided for that type. Which is correct: sample() offers a few interfaces, but none for a Row object. So the fix is, once again, simple. Add a line to aid the compiler by making the row a NumericVector and all is good.
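For readers wondering how "no interface for that type" turns into "ambiguous" rather than "no matching function", here is a minimal standalone sketch of the general C++ mechanism. It is not the RcppArmadillo source: DenseVec, OtherVec, RowProxy, pick() and pick_after_materialising() are all hypothetical names made up for illustration, and the assumption is simply that a row proxy can convert implicitly to more than one of the types the overloads accept.

```cpp
#include <vector>

// Hypothetical stand-ins; none of these are Rcpp or Armadillo types.
struct DenseVec { std::vector<double> data; };  // one accepted vector type
struct OtherVec { std::vector<double> data; };  // a second accepted vector type

// A lightweight "row view" that converts implicitly to either vector type.
struct RowProxy {
    std::vector<double> data;
    operator DenseVec() const { return DenseVec{data}; }
    operator OtherVec() const { return OtherVec{data}; }
};

// Two overloads, neither of which takes a RowProxy directly.
int pick(const DenseVec&) { return 1; }
int pick(const OtherVec&) { return 2; }

// Calling pick(row) here would not compile: both overloads need one
// user-defined conversion, so the compiler reports the call as ambiguous.
// Naming the target type first leaves exactly one exact match.
int pick_after_materialising(const RowProxy& row) {
    DenseVec z = row;  // explicit choice of the concrete type
    return pick(z);    // resolves to pick(const DenseVec&)
}
```

The extra NumericVector line in the fix plays the same role as `DenseVec z = row;` in this sketch: it hands the compiler a concrete type so it no longer has to choose a conversion.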

Fixed code

#include <RcppArmadillo.h>
#include <RcppArmadilloExtensions/sample.h>

// [[Rcpp::depends(RcppArmadillo)]]

using namespace Rcpp;

// [[Rcpp::export]]
IntegerVector sample_matrix(NumericMatrix x, IntegerVector choice_set) {
  int n = x.nrow();
  IntegerVector result(n);
  for ( int i = 0; i < n; ++i ) {
    // materialise the row as a NumericVector, a type sample() accepts
    Rcpp::NumericVector z(x(i, _));
    result[i] = RcppArmadillo::sample(choice_set, 1, false, z)[0];
  }
  return result;
}

Example

R> Rcpp::sourceCpp("answer.cpp")        # no need for library(Rcpp)   
R> 
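As a cross-check on what the function is meant to compute, the per-row draw can also be sketched in plain standard C++ without Rcpp. This is only an illustration under assumptions: sample_rows() is a hypothetical helper, the matrix is represented as a vector of rows, and the generator is std::mt19937 rather than R's RNG, so the draws will not match set.seed() in R.

```cpp
#include <random>
#include <vector>

// Hypothetical helper, not part of Rcpp or RcppArmadillo: for each row of
// probabilities, draw one label from choice_set with those weights.
std::vector<int> sample_rows(const std::vector<std::vector<double>>& probs,
                             const std::vector<int>& choice_set,
                             std::mt19937& rng) {
    std::vector<int> result;
    result.reserve(probs.size());
    for (const auto& row : probs) {
        // discrete_distribution yields index k with probability
        // row[k] / sum(row), so rows need not be normalised exactly.
        std::discrete_distribution<int> draw(row.begin(), row.end());
        result.push_back(choice_set[draw(rng)]);
    }
    return result;
}
```

A row such as {1.0, 0.0, 0.0} always yields the first element of choice_set, which makes for an easy sanity test of either implementation.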
Dirk Eddelbuettel
  • Why does this still give different results compared to `base::sample` with same `set.seed()`/`set.seed(, sample.kind="Rounding")`. See my recent [answer](https://stackoverflow.com/a/73001876/6574038). – jay.sf Jul 17 '22 at 10:27
  • Presumably because the implementation is not identical. It is also non-trivial, but the files for all three approaches (there is also a `sample` in Rcpp now) are open source so someone with time and interest -- maybe you? -- could drill down and debug. – Dirk Eddelbuettel Jul 17 '22 at 10:59
  • I've [implemented](https://stackoverflow.com/a/73001876/6574038) the function using `Rcpp::sample`, gives identical results now, I think it's perfect, thanks a ton! – jay.sf Jul 17 '22 at 11:52