
I would like for my C function to be able to manipulate some values stored in an R data frame.

To achieve this, I need the (real) memory address where the R data frame stores its data (hopefully contiguously); then, from R, I would call the C function and pass this memory address as a parameter.

The question: how can we get the memory address of the R data frame?

  • The lobstr-package should have what you want. Try the lobstr::obj_addr() function! – Jarn Schöber Jan 09 '20 at 11:59
  • Your proposed approach does not appear sensible. Have you studied the [relevant part](https://cran.r-project.org/doc/manuals/r-release/R-exts.html#Handling-R-objects-in-C) of "Writing R Extensions"? – Roland Jan 09 '20 at 12:14
  • @Roland: Thanks for the link - this seems the way to go. Do you know the `SEXPTYPE` of a data frame? Unfortunately, I couldn't figure it out just by looking at the link. –  Jan 09 '20 at 12:34
  • A data.frame is a list with a class attribute "data.frame" and a few other attributes. Use `dput(yourdataframe)` in R to see the data structure. https://cran.r-project.org/doc/manuals/r-release/R-exts.html#Handling-lists – Roland Jan 09 '20 at 12:51
  • @Roland thanks again for the info! Would you mind posting some C code (using R extensions) that does the same as the `dput` function? –  Jan 09 '20 at 14:48
  • Sorry, but no. I don't speak C myself. At most, I can dabble a bit in C++ and that only using Rcpp. I'm not sure why you need that but you can study the source code of `dput` there: https://github.com/wch/r-source/blob/58964c22a2e8f47a27e648f8fc68fac14bfeda63/src/main/deparse.c#L367 – Roland Jan 09 '20 at 15:00
  • A minimal example: https://stackoverflow.com/q/6658168/1968. A minimal intro into using R packages with compiled code: https://r-pkgs.org/src.html. – Konrad Rudolph Jan 09 '20 at 17:44

1 Answer


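Rcpp passes R objects by reference, i.e. it passes pointers to the objects rather than copies of their values. You therefore do not need to obtain a memory address yourself: a C++ function can modify the data through the Rcpp object it receives, just as it would through any pointer.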

Example:

library(Rcpp)

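cppFunction('
void f1(DataFrame x) {
  // V1 refers to the column itself (no copy: the column is already integer)
  IntegerVector V1 = x["V1"];
  // Equal-length assignment overwrites the existing elements in place
  V1 = V1 * 2;
}
')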

x = data.frame(V1 = 1:5, V2 = 1:5)
f1(x)
x
#   V1 V2
# 1  2  1
# 2  4  2
# 3  6  3
# 4  8  4
# 5 10  5
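To see why writing through `V1` inside `f1` changes `x` without any explicit address being passed, here is a toy model in plain C++ with no R or Rcpp involved. `Storage` and `IntHandle` are hypothetical names invented for this sketch; they only mimic the idea of a handle that points at shared data and are not how Rcpp is actually implemented.

```cpp
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for the memory that holds one data frame column.
using Storage = std::vector<int>;

// Hypothetical stand-in for an Rcpp vector type: a small handle that points
// at shared storage instead of owning a private copy of the values.
class IntHandle {
public:
    explicit IntHandle(std::shared_ptr<Storage> data) : data_(std::move(data)) {}

    std::size_t size() const { return data_->size(); }

    // Element access goes through the pointer, so writes reach the shared storage.
    int& operator[](std::size_t i) { return (*data_)[i]; }

private:
    std::shared_ptr<Storage> data_;
};

// The handle is passed by value, but copying a handle copies only the pointer.
// Every write below therefore lands in the caller's storage.
void double_in_place(IntHandle v) {
    for (std::size_t i = 0; i < v.size(); ++i) {
        v[i] *= 2;
    }
}
```

If a `Storage` holding 1 to 5 is wrapped in an `IntHandle` and passed to `double_in_place`, the caller's storage holds 2, 4, 6, 8, 10 afterwards and its buffer address is unchanged, which mirrors what the R session above shows for column `V1`.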
dww
  • This seems to be what I need! Would you mind posting the C/C++ code of the function `cppFunction` that makes it possible to do this multiplication/rewriting of a data frame column? –  Jan 09 '20 at 14:51
  • The C/C++ code is given in the example (i.e. it is passed as a text string to `cppFunction`). `cppFunction` is just a shorthand way to define C++ functions inline with R code. You can also write your C++ functions in a separate file in the normal way and then compile/link them to R using `sourceCpp`. Basically, the `Rcpp` package has already done the hard work for you of integrating R and C++ using pointers, so there is no need to reinvent the wheel. Have a look at http://adv-r.had.co.nz/Rcpp.html for a quick intro; for a more detailed version, see http://dirk.eddelbuettel.com/code/rcpp.html – dww Jan 09 '20 at 17:32
  • And _please_ consider reading at least the few pages of the [Rcpp Introduction](https://cloud.r-project.org/web/packages/Rcpp/vignettes/Rcpp-introduction.pdf) vignette. – Dirk Eddelbuettel Jan 09 '20 at 17:53
  • @DirkEddelbuettel Thanks for the pointers. Would you mind sending a link to the portion of the Rcpp code where the manipulation of a data frame takes place? –  Jan 09 '20 at 21:33
  • I am afraid you may still have the wrong mental model of how this works, which is why I already suggested reading at least the aforementioned [Rcpp Introduction](https://cloud.r-project.org/web/packages/Rcpp/vignettes/Rcpp-introduction.pdf) vignette. You could consider many other next steps: the [RcppExamples](https://cran.r-project.org/package=RcppExamples) package has a `data.frame` example, and the [Rcpp Gallery](https://gallery.rcpp.org/) has several posts. So in short: there is not one succinct code chunk that does this, which is why Rcpp has tens of thousands of lines of code. – Dirk Eddelbuettel Jan 10 '20 at 01:52