
Consider these two functions:

library(Rcpp)

cppFunction("NumericVector func1(NumericVector &x)
{
    for (int i = 0; i < x.length(); i++)
        x[i] = x[i] * 2;
    return x;
}")


cppFunction("NumericVector func2(NumericVector x)  // no &
{
    for (int i = 0; i < x.length(); i++)
        x[i] = x[i] * 2;
    return x;
}")

The only difference is that func1 takes x by reference, whereas func2 takes it by value. If this were regular C++, I'd expect func1 to be able to change the value of x in the calling code, and func2 not to.

However:

> x <- 1:10/5  # ensure x is numeric, not integer
> x
 [1] 0.2 0.4 0.6 0.8 1.0 1.2 1.4 1.6 1.8 2.0
> func1(x)
 [1] 0.4 0.8 1.2 1.6 2.0 2.4 2.8 3.2 3.6 4.0
> x
 [1] 0.4 0.8 1.2 1.6 2.0 2.4 2.8 3.2 3.6 4.0  # x in calling env has been modified


> x <- 1:10/5  # reset x
> x
 [1] 0.2 0.4 0.6 0.8 1.0 1.2 1.4 1.6 1.8 2.0
> func2(x)
 [1] 0.4 0.8 1.2 1.6 2.0 2.4 2.8 3.2 3.6 4.0
> x
 [1] 0.4 0.8 1.2 1.6 2.0 2.4 2.8 3.2 3.6 4.0  # x is also modified

So it looks like func1 and func2 behave the same way, as far as side-effects on the arguments are concerned.

What is the reason for this? In general, is it better to pass arguments to Rcpp functions by reference or by value?

Hong Ooi
    Rcpp objects are passed from R as references (even without the `&`). I have 2 slides on that in [my Rcpp presentation](https://privefl.github.io/R-presentation/Rcpp.html#29). – F. Privé Mar 14 '18 at 12:09
    See [Rcpp FAQ 5.1](https://cloud.r-project.org/web/packages/Rcpp/vignettes/Rcpp-FAQ.pdf); we have explained this quite a few times over the last eight or nine years. Comes with the `SEXP` territory. – Dirk Eddelbuettel Mar 14 '18 at 12:23

1 Answer

First, both your functions return a NumericVector that is never assigned to a variable, so the return value plays no role in the side effect on x. The versions below return void instead and show the same behavior.

cppFunction("void func1(NumericVector& x)
{
    for (int i = 0; i < x.length(); i++)
        x[i] = x[i] * 2;
}")


cppFunction("void func2(NumericVector x)  // no &
{
    for (int i = 0; i < x.length(); i++)
        x[i] = x[i] * 2;
}")

x <- 1:10/5
func1(x)
print(x)

x <- 1:10/5
func2(x)
print(x)

Second, a NumericVector behaves like a pointer inside the C++ function. The pointer holds the address where the values are stored. To change the values at that address, you only need to know the address; you do not need to be able to modify the address itself. Therefore, as far as modifying the elements is concerned, it makes no difference whether the pointer is passed by value or by reference.
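To make the pointer analogy concrete, here is a minimal sketch in plain C++ (not Rcpp itself). `Handle` is a hypothetical stand-in for NumericVector: a small object that only records where the data lives. The function names are made up for this illustration as well. Copying a `Handle` copies the address, not the values, so both doubling functions change the caller's data. Only an explicit copy of the values avoids the side effect, which is roughly the role `Rcpp::clone()` plays in Rcpp.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for Rcpp::NumericVector: it does not own the data,
// it only stores where the data lives and how many values there are.
struct Handle {
    double* data;
    std::size_t size;
};

// The handle is copied, but the copy points at the same storage.
void double_by_value(Handle h)
{
    for (std::size_t i = 0; i < h.size; ++i)
        h.data[i] *= 2;
}

// The handle is passed by reference; the effect on the storage is identical.
void double_by_ref(Handle& h)
{
    for (std::size_t i = 0; i < h.size; ++i)
        h.data[i] *= 2;
}

// Copy the values first, then modify only the copy.
std::vector<double> doubled_copy(Handle h)
{
    std::vector<double> out(h.data, h.data + h.size);
    for (double& v : out)
        v *= 2;
    return out;
}
```

Calling `double_by_value` or `double_by_ref` on a `Handle` that points into an existing array doubles that array in place, while `doubled_copy` leaves it untouched and returns the doubled values separately.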

This thread contains useful knowledge on the behavior of NumericVector:

Should I prefer Rcpp::NumericVector over std::vector?

The program below demonstrates the same behavior in C++.

#include <iostream>

void func1(double* a) // The pointer is passed by value.
{
    for (int i=0; i<3; ++i)
        a[i] *= 2;
}

void func2(double*& a) // The pointer is passed by reference.
{
    for (int i=0; i<3; ++i)
        a[i] *= 2;
}

void print(double* a)
{
    std::cout << "Start print:" << std::endl;
    for (int i=0; i<3; ++i)
        std::cout << a[i] << std::endl;
}

int main()
{
    double* x = new double[3];

    // Set the values with 1, 2, and 3.
    for (int i = 0; i<3; ++i)
        x[i] = i+1;

    print(x);
    func1(x);
    print(x);

    // Reset the values with 1, 2, and 3.
    for (int i = 0; i<3; ++i)
        x[i] = i+1;

    // This block shows the same behavior as the block above.
    print(x);
    func2(x);
    print(x);

    delete[] x;
}
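For contrast, here is a sketch of the one case where passing the pointer by reference does matter: changing the address itself rather than the values behind it. The names `reseat_by_value` and `reseat_by_ref` are invented for this illustration.

```cpp
// Both functions return the address their parameter holds after the assignment.

double* reseat_by_value(double* a, double* target)
{
    a = target;   // changes only this function's copy of the pointer
    return a;
}

double* reseat_by_ref(double*& a, double* target)
{
    a = target;   // changes the caller's pointer as well
    return a;
}
```

After `reseat_by_value(p, q)` the caller's `p` still points where it did before. After `reseat_by_ref(p, q)` it points at `q`. Doubling the elements never reassigns the pointer, which is why the two versions above behave identically.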
Chiel