11

Is there a way to pass a data.table object to C++ functions using Rcpp and/or RcppArmadillo without manually converting the data.table to a data.frame? In the example below, test_rcpp(X2) and test_arma(X2) both fail with "c++ exception (unknown reason)".

R code

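library(data.table)

X  <- data.frame(c(1:100), c(1:100))
X2 <- data.table(X)
test_rcpp(X)
test_rcpp(X2)  # fails: c++ exception (unknown reason)
test_arma(X)
test_arma(X2)  # fails: c++ exception (unknown reason)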

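C++ functions

#include <RcppArmadillo.h>
// [[Rcpp::depends(RcppArmadillo)]]
using namespace Rcpp;
using namespace arma;

// [[Rcpp::export]]
NumericMatrix test_rcpp(NumericMatrix X) {
    return X;
}

// [[Rcpp::export]]
mat test_arma(mat X) {
    return X;
}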
user2503795

4 Answers

13

Building on top of other answers, here is some example code:

#include <Rcpp.h>
using namespace Rcpp ;

// [[Rcpp::export]]
double do_stuff_with_a_data_table(DataFrame df){
    CharacterVector x = df["x"] ;
    NumericVector   y = df["y"] ;
    IntegerVector   z = df["v"] ;

    /* do whatever with x, y, z */
    double res = sum(y) ;
    return res ;
}

So, as Matthew says, this treats the data.table as a data.frame (aka a Rcpp::DataFrame in Rcpp).

require(data.table)
DT <- data.table(
    x=rep(c("a","b","c"),each=3), 
    y=c(1,3,6), 
    v=1:9)
do_stuff_with_a_data_table( DT ) 
# [1] 30

This completely ignores the internals of the data.table.

Romain Francois
10

Try passing the data.table as a DataFrame rather than NumericMatrix. It is a data.frame anyway, with the same structure, so you shouldn't need to convert it.
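
To make the "same structure" point concrete: a data.frame (and hence a data.table) is stored as a list of equal-length columns, whereas a matrix is one contiguous column-major block, so getting from one to the other means copying column by column. The sketch below is a standalone model in plain C++ with no Rcpp dependency. The names Columns and to_column_major are hypothetical and only illustrate the copy that a data-frame-to-matrix conversion would need, assuming every column is numeric.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for a data frame: a list of numeric columns.
using Columns = std::vector<std::vector<double>>;

// Copy the columns into one contiguous column-major buffer,
// so that element (i, j) ends up at index i + j * nrow.
std::vector<double> to_column_major(const Columns& cols) {
    if (cols.empty()) return {};
    const std::size_t nrow = cols.front().size();
    std::vector<double> out;
    out.reserve(nrow * cols.size());
    for (const auto& col : cols) {
        // A data frame requires all columns to have the same length.
        if (col.size() != nrow) {
            throw std::invalid_argument("columns must have equal length");
        }
        out.insert(out.end(), col.begin(), col.end());
    }
    return out;
}
```

An Rcpp version would presumably run the same loop over the columns of a DataFrame argument and fill a NumericMatrix, rather than asking Rcpp to convert the data.table to a matrix directly.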

Romain Francois
Matt Dowle
6

Rcpp sits on top of native R types encoded as SEXP. This includes, e.g., data.frame or matrix.

data.table is not native; it is an add-on. So someone who wants this (you?) has to write a converter, or provide funding for someone else to write one.

Romain Francois
Dirk Eddelbuettel
  • Can't he just pass it as a data.frame, rather than NumericMatrix? – Matt Dowle Dec 08 '12 at 10:15
  • Yes @MatthewDowle, a Rcpp::DataFrame shadows an R data frame. Does `data.table` have a C API? – Romain Francois Dec 08 '12 at 11:07
  • @RomainFrancois. Great. Then in Rcpp he can treat the data.table exactly as you would a DataFrame. It's an identical structure, just with an extra attribute here and there. I haven't done anything special to provide a C api - should I? – Matt Dowle Dec 08 '12 at 13:56
  • To us, that is essentially a list of columns, so yes this should work. It of course doesn't offer any `data.table` hotness at our end which I fear the OP may (naively) expect. – Dirk Eddelbuettel Dec 08 '12 at 14:16
  • Having a C api would allow people to take advantage of `data.table` wizardry without going back to R syntax, perhaps providing a `DataTable` class. Not sure what would be involved, etc ... as I'm not familiar with the internals of `data.table`. – Romain Francois Dec 08 '12 at 14:22
  • 2
    Thanks @RomainFrancois and Dirk. I can see lots of goodness here using as DataFrame in Rcpp already. Happy to investigate DataTable Rcpp class if anyone needs it ... – Matt Dowle Dec 08 '12 at 15:50
  • 3
    If there is indeed a need, why not. I guess this would be something additional, as we would not want our packages to depend on each other. But a joint effort on a `RcppDataTable` package or whatever .. – Romain Francois Dec 08 '12 at 18:14
3

For reference, I think a good approach is to return a list from Rcpp, since data.table allows updating columns via lists.

Here is a dummy example:

cCode <- 
    '
     DataFrame DT(DTi);
     NumericVector x = DT["x"];
     int N = x.size();
     LogicalVector b(N);
     NumericVector d(N);
     for(int i=0; i<N; i++){
         b[i] = x[i]<=4;
         d[i] = x[i]+1.;
     }
     return Rcpp::List::create(Rcpp::Named("b") = b, Rcpp::Named("d") = d);
    ';

require("data.table");
require("Rcpp");
require("inline");
DT <- data.table(x=1:9,y=sample(letters,9)) #declare a data.table
modDataTable <- cxxfunction(signature(DTi="data.frame"), plugin="Rcpp", body=cCode)

DT_add <- modDataTable(DT)  #here we get the list
DT[, names(DT_add):=DT_add] #here we update by reference the data.table
statquant