3

I am working on a package, which uses random numbers from RcppArmadillo. The package runs MCMC algorithms, and to obtain exact reproducibility, the user should be able to set a random number seed. When doing this, it seems like the arma::randg() function for generating random numbers from the gamma distribution returns different values across platforms. This is not the case for arma::randu() or arma::randn(). Could it be related to the fact that arma::randg() requires C++11?

Here is what I get on macOS 10.13.6, running R3.5.2:

library(Rcpp)
library(RcppArmadillo)

sourceCpp(code = '
#include <RcppArmadillo.h>
using namespace Rcpp;

 // [[Rcpp::plugins(cpp11)]]
 // [[Rcpp::depends(RcppArmadillo)]]

 // [[Rcpp::export]]
 double random_gamma() {
 return arma::randg();
 }

 // [[Rcpp::export]]
 double random_uniform() {
 return arma::randu();
 }

 // [[Rcpp::export]]
 double random_normal() {
 return arma::randn();
 }
 '
)

replicate(2, {set.seed(1); random_gamma()})
#> [1] 1.507675 1.507675
replicate(2, {set.seed(432); random_gamma()})
#> [1] 0.02234341 0.02234341
replicate(2, {set.seed(1); random_uniform()})
#> [1] 0.2655087 0.2655087
replicate(2, {set.seed(1); random_normal()})
#> [1] -1.390378 -1.390378

Created on 2019-02-22 by the reprex package (v0.2.1)

Here is what I get on Windows 10, running R3.5.2:

library(Rcpp)
library(RcppArmadillo)

sourceCpp(code = '
          #include <RcppArmadillo.h>
          using namespace Rcpp;

          // [[Rcpp::plugins(cpp11)]]
          // [[Rcpp::depends(RcppArmadillo)]]

          // [[Rcpp::export]]
          double random_gamma() {
          return arma::randg();
          }

          // [[Rcpp::export]]
          double random_uniform() {
          return arma::randu();
          }

          // [[Rcpp::export]]
          double random_normal() {
          return arma::randn();
          }
          '
)

replicate(2, {set.seed(1); random_gamma()})
#> [1] 0.2549381 0.2549381
replicate(2, {set.seed(432); random_gamma()})
#> [1] 0.2648896 0.2648896
replicate(2, {set.seed(1); random_uniform()})
#> [1] 0.2655087 0.2655087
replicate(2, {set.seed(1); random_normal()})
#> [1] -1.390378 -1.390378

Created on 2019-02-22 by the reprex package (v0.2.1)

As can be seen, the random numbers generated with arma::randg() are internally consistent, but differ between the platforms.

I tried to set the seed using the instructions in the Armadillo documentation:

library(Rcpp)
library(RcppArmadillo)

sourceCpp(code = '
          #include <RcppArmadillo.h>
          using namespace Rcpp;

          // [[Rcpp::plugins(cpp11)]]
          // [[Rcpp::depends(RcppArmadillo)]]

          // [[Rcpp::export]]
          double random_gamma(int seed) {
          arma::arma_rng::set_seed(seed);
          return arma::randg();
          }
          '
)

replicate(4, random_gamma(1))
#> Warning in random_gamma(1): When called from R, the RNG seed has to be set
#> at the R level via set.seed()
#> [1] 1.3659195 0.6447221 1.1771862 0.9099034

Created on 2019-02-22 by the reprex package (v0.2.1)

However, as the warning tells, and the result shows, the seed does not get set this way.

Is there a way to get reproducible results between platforms when using arma::randg(), or do I need to implement a gamma distribution using some of the other random number generators available in RcppArmadillo?

Update

As pointed out in the comment, using R::rgamma() solves this problem. The following code returns the same numbers on both Mac and Windows:

library(Rcpp)

sourceCpp(code = '
          #include <Rcpp.h>
          using namespace Rcpp;

          // [[Rcpp::export]]
          double random_gamma() {
            return R::rgamma(1.0, 1.0);
          }
          '
)

replicate(2, {set.seed(1); random_gamma()})
#> [1] 0.1551414 0.1551414

Created on 2019-02-22 by the reprex package (v0.2.1)

This solves it for me. However, I am not sure if the issue is solved, since this seems like unexpected behavior, so leaving it open.

Øystein S
  • 536
  • 2
  • 11
  • 4
    Is there any specific reason for using the Armadillo rather than the Rcpp rgamma function? if the issue didnt persist across platforms using the `Rcpp::rgamma()` equivalent, you could use either the output from this in your calculations, or if the type differs cast it to the type you need during your calculations. – Oliver Feb 22 '19 at 08:51
  • 2
    Actually, I had no reason for using Armadillo other than that the whole package heavily uses other Armadillo functions. I changed it now, and it works. – Øystein S Feb 22 '19 at 09:15
  • 3
    I have not checked it, but probably Armadillo uses `std::gamma_distribution` from C++11's `` header. Unfortunately the distributions are "implementation defined", which is probably the reason why WRE advises against using that header. – Ralf Stubner Feb 22 '19 at 09:25
  • 3
    Looking up the source code, it seems you are absolutely right. `` [is included](https://gitlab.com/conradsnicta/armadillo-code/blob/9.300.x/include/armadillo#L57-65) if C++11 is used, and `arma::randg()` [only works with C++ 11](https://gitlab.com/conradsnicta/armadillo-code/blob/9.300.x/include/armadillo_bits/fn_randg.hpp). – Øystein S Feb 22 '19 at 09:52
  • 1
    Relying on a specific sequence of random numbers is generally a bad idea. What if R developers decide to slightly change the RNG, say for speed reasons? A more robust way to implement reproducibility is through the statistics of the random number sequence. For example, take the mean and see if it falls within specific bounds. – hbrerkere Feb 23 '19 at 02:12
  • Absolutely, but say if I want to submit a paper with an analysis based on the package. Then the journal may require a script for exactly reproducing the results, to every decimal. Although the conclusions should not depend on the seed chosen, things like cluster labels may do. For example, between two given runs, cluster 1 and cluster 2 may be switched. In addition, this was for package development, and letting the user be able to set the seed seems reasonable. – Øystein S Feb 23 '19 at 04:35

1 Answers1

2

Summarizing the discussion in the comments:

Ralf Stubner
  • 26,263
  • 3
  • 40
  • 75