I have a problem including a data.table operation in a function. Input arguments are the data.table name and the column/variable name.

I can refer to the data.table by using get(). However, the same approach doesn't work for the variable name. I know that get() might not be appropriate for a column/variable name, but I am stuck on which function to use instead.

EDITED: I have now included substitute() instead of get() and it still doesn't work.

toy_example_fun <- function(d, .expr){

  .expr = substitute(.expr)

  setkey(get(d), .expr)  # ==> doesn't work

  d.agg <- get(d)[,list(sum(y), sum(v)), by=.expr]  # --> works
}

toy_example_fun("DT", x)

ALTERNATIVE: quote() --> This works. However, I am interested in a solution that works inside a function.

DT <- data.table(x=rep(c("a","b","c"),each=3), y=c(1,3,6), v=1:9)    
d <- "DT"
variable <- quote(x)
d.agg <- get(d)[,list(sum(y), sum(v)), by=variable]  

Even though the latter alternative works, `variable <- quote(x)` produces an error message:

  <simpleError in doTryCatch(return(expr), name, parentenv, handler): object 'x' not found>
    <simpleError in is.scalar(val): object 'x' not found>
    <simpleError in is.data.frame(obj): object 'x' not found> 

Thanks for your help.

majom
  • possible duplicate of [Using data.table i and j arguments in functions](http://stackoverflow.com/questions/9705488/using-data-table-i-and-j-arguments-in-functions) – Andrie Aug 08 '12 at 07:09
  • Thanks for referring me to your thread, Andrie. However, somehow it still won't work if I use `substitute()`. – majom Aug 08 '12 at 08:00
  • Please edit your question by incorporating the suggestions in the answer to the linked question - otherwise you risk it getting closed as a duplicate. – Andrie Aug 08 '12 at 08:00
  • This is still a duplicate. You should use `quote(x)` as explained in the linked answer. – Andrie Aug 08 '12 at 08:12
  • @Andrie Nice pointer to other question. Strictly not a duplicate because OP seems to want function that accepts character inputs. If all OP wants is to pass column names around to group by (and not language expressions) then just passing character column names to by can be simpler. As usual, many ways to skin the cat. – Matt Dowle Aug 08 '12 at 09:27

1 Answer

Here you go:

someFun <- function(d, .expr){
  group <- substitute(.expr)
  get(d)[,list(sum(y), sum(v)), by=group]
}

someFun("DT", x)
   group V1 V2
1:     a 10  6
2:     b 10 15
3:     c 10 24


someFun("DT", "x")
   x V1 V2
1: a 10  6
2: b 10 15
3: c 10 24

EDIT from Matthew:

+1 to the above. Alternatively, character column names can be passed to `by` directly:

someFun = function(d, col) {
    get(d)[,list(sum(y),sum(v)),by=col]
}
someFun("DT","x")
   x V1 V2
1: a 10  6
2: b 10 15
3: c 10 24
someFun("DT","x,y")
   x y V1 V2
1: a 1  1  1
2: a 3  3  2
3: a 6  6  3
4: b 1  1  4
5: b 3  3  5
6: b 6  6  6
7: c 1  1  7
8: c 3  3  8
9: c 6  6  9

but then `someFun("DT", x)` won't work, so Andrie's answer is more general.


EDIT with setkeyv

someFun <- function(d, cols){
  setkeyv(get(d), cols)
  cols <- substitute(cols)
  get(d)[,list(sum(y), sum(v)), by=cols]
}

someFun("DT", "x")
   x V1 V2
1: a 10  6
2: b 10 15
3: c 10 24
Andrie
  • Thanks, Andrie and Matthew. Your answers helped me a lot. However, in my final function I also need to use the `setkey()` command. While the other part works, the line `setkey(get(d), .expr)` still doesn't work. Actually, that was the reason for my confusion (I never looked at the aggregation because I got the error from `setkey()` in the first place). – majom Aug 08 '12 at 09:35
  • @majom Try `?setkeyv` instead (extra v on the end). – Matt Dowle Aug 08 '12 at 09:51
  • Done. Thanks for the hint and thanks for developing such an awesome package! – majom Aug 09 '12 at 07:18
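
For later readers, here is a minimal sketch of how the `setkeyv` hint from the comments might be combined with the aggregation in one function. `setkey()` expects unquoted column names, while `setkeyv()` takes them as a character vector, so the column argument is converted to a string first. The helper name `agg_by` and the output column names `sum_y` and `sum_v` are made up for illustration and do not appear in the thread; the sketch also assumes a single grouping column.

```r
library(data.table)

DT <- data.table(x = rep(c("a", "b", "c"), each = 3), y = c(1, 3, 6), v = 1:9)

# Hypothetical helper (not from the thread): takes the table name as a string
# and one grouping column, given either bare (x) or quoted ("x").
agg_by <- function(d, col) {
  col_expr <- substitute(col)
  # A bare symbol is deparsed to its name; a character string is kept as is.
  col_name <- if (is.character(col_expr)) col_expr else deparse(col_expr)
  dt <- get(d, envir = parent.frame())
  # setkeyv() accepts column names as a character vector and keys by reference.
  setkeyv(dt, col_name)
  dt[, list(sum_y = sum(y), sum_v = sum(v)), by = col_name]
}

res <- agg_by("DT", x)    # bare column name
res2 <- agg_by("DT", "x") # quoted column name gives the same result
```

Because `setkeyv()` works by reference, `DT` itself ends up keyed by `x` after the call. If that side effect is unwanted, keying a `copy()` of the table would be one way around it.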