0

I use Rcpp to outsource a nested loop from R (the time improvements are huge!). But I have encountered a strange behavior now. The first function testrun takes as argument a matrix X and loops over every row of that matrix. The row is passed to another function testfun which runs another loop and performs some computations. For every computation a value is returned and stored and eventually returned by the testrun function:

// [[Rcpp::export]]
Rcpp::NumericVector testrun(Rcpp::NumericMatrix X, Rcpp::NumericVector wal)
{
    Rcpp::NumericVector c;

    for(int i = 0; i != X.rows(); ++i)
    {
        c.push_back(testfun(X.row(i), wal));
    }

    return c;
}

The function testfun works on every row of X:

// [[Rcpp::export]]
int testfun(Rcpp::NumericVector x, Rcpp::NumericVector val)
{   
    int t = 0;
    Rcpp::NumericVector val2 = val;

    while(1)
    {
        std::vector<double> a;
        std::vector<double> b;

        for(int j = 0; j != x.length(); ++j)
        {
            a.push_back(std::sin(x.at(j)));
            b.push_back(R::rnorm(1.0, 1.0));
        }

        for(int j = 0; j != x.length(); ++j)
        {   
            if(a.at(j) > b.at(j))
            {
                val2[j] = 1;   
            }
        }

        if(Rcpp::sum(val2) == x.length())
        {
            break;
        }

        ++t;
    }

    return t;
}

The second argument of testrun and testfun is a vector of "modifier" values which are used in the computation. The values of this vector get updated depending on the computation and thus serve as exit condition for the loop.

I call testrun from R

X = expand.grid(replicate(2, list(seq(-3,3,length.out=10))))
wal = c(0.4, 0.2)

testrun(X, wal)

Both values in wal are < 1. But apparently the values in wal (in R!) get updated when updating val2 in testfun. That is, the first time the exit condition is satisfied, it is satisfied for all subsequent steps. Why is this the case? And how do I prevent this behavior? If I use a single double for wal (and val2) it works as intended.

Syd
  • 23
  • 2
  • 2
    This is pretty well documented behavior, but also a common thing for newcomers to trip up on. The short answer is change the line `Rcpp::NumericVector val2 = val;` to `Rcpp::NumericVector val2 = Rcpp::clone(val);`. The longer answer (as to *why*), is covered in several places, such as [this Stack Overflow answer by Dirk Eddelbuettel himself](https://stackoverflow.com/a/11300707/8386140) – duckmayr Dec 28 '19 at 14:08
  • 2
    See the Rcpp FAQ; what you pass are internal `SEXP` objects where P stands for pointer. So `val2` alters `val` as they are both pointers to the same memory. If you want a distinct deep copy, make one. – Dirk Eddelbuettel Dec 28 '19 at 14:08
  • 2
    Also don't use `push_back()` on Rcpp vectors -- very inefficient for similar reasons of R implementation details. – Dirk Eddelbuettel Dec 28 '19 at 14:11
  • Thanks for the replies. In fact, I wasn't aware of this (but as duckmayr spotted: I am a newbie to Rcpp and advanced in C++ neither – the huge performance improvements made me use it and it's great :-D) thank you! – Syd Dec 28 '19 at 21:10

0 Answers0