
I made a first stab at an Rcpp function via inline and it solved my speed problem (thanks Dirk!); see "Replace negative values by zero".

The initial version looked like this:

library(inline)
cpp_if_src <- '
  Rcpp::NumericVector xa(a);
  int n_xa = xa.size();
  for(int i=0; i < n_xa; i++) {
    if(xa[i]<0) xa[i] = 0;
  }
  return xa;
'
cpp_if <- cxxfunction(signature(a="numeric"), cpp_if_src, plugin="Rcpp")

But when I called cpp_if(p), it overwrote p with the output, which was not what I intended. So I assumed it was passing by reference.

So I fixed it with the following version:

library(inline)
cpp_if_src <- '
  Rcpp::NumericVector xa(a);
  int n_xa = xa.size();
  Rcpp::NumericVector xr(a);
  for(int i=0; i < n_xa; i++) {
    if(xr[i]<0) xr[i] = 0;
  }
  return xr;
'
cpp_if <- cxxfunction(signature(a="numeric"), cpp_if_src, plugin="Rcpp")

This seemed to work. But now the original version no longer overwrites its input when I reload it into R (i.e. the exact same code now behaves differently):

> cpp_if_src <- '
+   Rcpp::NumericVector xa(a);
+   int n_xa = xa.size();
+   for(int i=0; i < n_xa; i++) {
+     if(xa[i]<0) xa[i] = 0;
+   }
+   return xa;
+ '
> cpp_if <- cxxfunction(signature(a="numeric"), cpp_if_src, plugin="Rcpp")
> 
> p
 [1] -5 -4 -3 -2 -1  0  1  2  3  4  5
> cpp_if(p)
 [1] 0 0 0 0 0 0 1 2 3 4 5
> p
 [1] -5 -4 -3 -2 -1  0  1  2  3  4  5

I'm not the only one who has tried to replicate this behavior and found inconsistent results:

https://chat.stackoverflow.com/transcript/message/4357344#4357344

What's going on here?

Ari B. Friedman
  • Can you reformulate your question? Do you, or don't you, want to overwrite? Seems to me that version 2 achieves what it set out to do... Also, there is a dedicated mailing list devoted to Rcpp where you are likely to get decent answers. – Dirk Eddelbuettel Jul 02 '12 at 20:10
  • Tried to edit for clarity. I do not want it to overwrite. If this is non-obvious then I guess I should post to the mailing list, but didn't want to bother folks otherwise. – Ari B. Friedman Jul 02 '12 at 20:21

1 Answer


The key is the 'proxy model': your xa really is the same memory location as your original object, so you end up changing your original.

If you don't want that, you should do one thing: make a (deep) copy using the clone() method, or explicitly create a new object into which the altered values get written. Method two does neither: it simply uses two differently named variables which are both "pointers" (in the proxy model sense) to the original variable.
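As a minimal sketch of the clone() suggestion above: the function name f3 is made up for illustration, and the body simply mirrors the question's loop but runs it on a deep copy made with Rcpp::clone(), so the caller's vector should be left alone.

```r
library(inline)

# f3 is a hypothetical name; it works on a deep copy instead of the proxy
f3 <- cxxfunction(signature(a="numeric"), plugin="Rcpp", body='
  Rcpp::NumericVector xa(a);                  // proxy: shares memory with a
  Rcpp::NumericVector xr = Rcpp::clone(xa);   // deep copy: separate memory
  int n = xr.size();
  for(int i=0; i < n; i++) {
    if(xr[i]<0) xr[i] = 0;                    // only the copy is modified
  }
  return xr;
')

p <- as.numeric(seq(-2,2))   # double vector, so no implicit copy on the way in
q <- f3(p)
print(cbind(q, p))           # p should still hold its negative values
```

The copy costs one extra allocation per call, which is the price of leaving the input untouched.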

An additional complication, though, is the implicit cast and copy when you pass an int vector (from R) to a NumericVector type: that creates a copy, and then the original no longer gets altered.
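A quick way to tell which case applies is to check the storage type in R before calling the compiled function. The sketch below uses made-up variable names, and it is only an assumption that the p in the question was built with something like -5:5; the question does not say.

```r
# Hypothetical inputs: the colon operator on whole numbers gives an integer vector
p_int <- -5:5
# The same values stored as doubles
p_dbl <- as.numeric(-5:5)

print(typeof(p_int))      # "integer": converted (copied) on the way to NumericVector
print(typeof(p_dbl))      # "double":  wrapped in place, so it can be overwritten
print(is.integer(p_int))  # TRUE
print(is.integer(p_dbl))  # FALSE
```

If the two runs in the question used differently constructed vectors, that alone could account for the seemingly inconsistent behavior.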

Here is a more explicit example, similar to one I use in the tutorials or workshops:

library(inline)
f1 <- cxxfunction(signature(a="numeric"), plugin="Rcpp", body='
  Rcpp::NumericVector xa(a);
  int n = xa.size();
  for(int i=0; i < n; i++) {
    if(xa[i]<0) xa[i] = 0;
  }
  return xa;
')

f2 <- cxxfunction(signature(a="numeric"), plugin="Rcpp", body='
  Rcpp::NumericVector xa(a);
  int n = xa.size();
  Rcpp::NumericVector xr(a);            // still points to a
  for(int i=0; i < n; i++) {
    if(xr[i]<0) xr[i] = 0;
  }
  return xr;
')

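p <- seq(-2,2)                  # integer vector: copied when converted to NumericVector
print(class(p))
print(cbind(f1(p), p))          # p stays unchanged
print(cbind(f2(p), p))          # p stays unchanged
p <- as.numeric(seq(-2,2))      # double vector: wrapped in place, no copy
print(class(p))
print(cbind(f1(p), p))          # p is overwritten
print(cbind(f2(p), p))          # p was already altered by the f1 call above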

and this is what I see:

edd@max:~/svn/rcpp/pkg$ r /tmp/ari.r
Loading required package: methods
[1] "integer"
        p
[1,] 0 -2
[2,] 0 -1
[3,] 0  0
[4,] 1  1
[5,] 2  2
        p
[1,] 0 -2
[2,] 0 -1
[3,] 0  0
[4,] 1  1
[5,] 2  2
[1] "numeric"
       p
[1,] 0 0
[2,] 0 0
[3,] 0 0
[4,] 1 1
[5,] 2 2
       p
[1,] 0 0
[2,] 0 0
[3,] 0 0
[4,] 1 1
[5,] 2 2
edd@max:~/svn/rcpp/pkg$

So it really matters whether you pass int-to-float (converted, hence copied) or float-to-float (used in place, hence overwritten).

Dirk Eddelbuettel
  • Thanks Dirk. That tells me how I should have written the function (and I'll file it away for future use). And I think the implicit cast and copy likely explains the seeming inconsistency. – Ari B. Friedman Jul 02 '12 at 20:58
  • It's a head-scratcher when you first encounter it, but yes, it does make sense. – Dirk Eddelbuettel Jul 02 '12 at 21:02
  • Very interesting. One minor niggle for clarity: shouldn't p be redefined after each function call - particularly the second call to f1? Otherwise it's the altered p that's being fed into f2... right? – Tim P Jul 03 '12 at 23:41
  • This is quite interesting. The only way to know this is through a debugger, I guess (or you write the package like Dirk did). Is there a way to attach a debugger to Rcpp functions? I am thinking of something like the way the Visual Studio debugger attaches to a DLL, so that when it is hit you can step into the code. – adam Feb 26 '15 at 16:33
  • @adam: gcc/g++ are decades old, and so is their best friend gdb. – Dirk Eddelbuettel Feb 26 '15 at 16:41