4

I want to use ... to specify which columns a user-defined function should return from a data.table object. Here is a minimal reproducible example:

library(data.table)
d = data.table(mtcars)

getvar = function(...){
  return(d[,.(xyz = mean(hp), ...), cyl])
}

getvar(mpg, cyl, disp)

Error in [.data.table(d, , .(N = .N, ...), cyl) : object 'cyl' not found

What I wish to get is:

d[,.(xyz = mean(hp), mpg, cyl, disp), cyl]

 #    cyl       xyz  mpg cyl  disp
 # 1:   6 122.28571 21.0   6 160.0
 # 2:   6 122.28571 21.0   6 160.0
 # 3:   6 122.28571 21.4   6 258.0
 # 4:   6 122.28571 18.1   6 225.0
 # 5:   6 122.28571 19.2   6 167.6

Can anyone share a solution?

zx8754
Miao Cai
  • Related post: https://stackoverflow.com/q/51259346/680068 – zx8754 Nov 05 '19 at 08:07
  • What is the use case for this? And I know this is beside the point, but `d[, xyz := mean(hp), cyl]` would be a simple implementation – Cole Nov 06 '19 at 10:49
  • @Cole If the input data is wide, the function needs to return a data.table with only a limited number of columns. – Miao Cai Nov 06 '19 at 15:48

3 Answers

5

One possible solution is to use mget inside your function: it returns a list of the requested columns, which you can then combine with xyz using c(). For this to work, the columns to add must be passed as a character vector:

getvar = function(...){
  return(d[, c(xyz = mean(hp), mget(...)), cyl])
}

getvar(c("mpg", "cyl", "disp"))

which gives:

> getvar(c("mpg", "cyl", "disp"))
    cyl       xyz  mpg cyl  disp
 1:   6 122.28571 21.0   6 160.0
 2:   6 122.28571 21.0   6 160.0
 3:   6 122.28571 21.4   6 258.0
 4:   6 122.28571 18.1   6 225.0
 5:   6 122.28571 19.2   6 167.6
 6:   6 122.28571 17.8   6 167.6
 7:   6 122.28571 19.7   6 145.0
 8:   4  82.63636 22.8   4 108.0
 9:   4  82.63636 24.4   4 146.7
10:   4  82.63636 22.8   4 140.8
....
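To see why c() and mget fit together here, the following base R sketch mimics a single group outside of data.table. The names group_cols, env, picked and combined, as well as the values, are hypothetical stand-ins and not taken from mtcars:

```r
# Hypothetical stand-in for the columns of one group (values are made up)
group_cols <- list(hp = c(110, 110, 93),
                   mpg = c(21, 21, 22.8),
                   disp = c(160, 160, 108))

# mget() looks up several names at once and returns a named list
env <- list2env(group_cols)
picked <- mget(c("mpg", "disp"), envir = env)

# c() with a list argument yields a list, so the scalar mean becomes
# one more named element in front of the selected columns
combined <- c(xyz = mean(group_cols$hp), picked)
str(combined)
```

Inside data.table's j, a list like this is what becomes the output columns, with the length-one xyz recycled to the length of the group.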

Alternatively, here is a slight variation of @Ronak Shah's answer (thanks to @zx8754) that accepts unquoted column names:

getvar = function(...){
  mc <- match.call(expand.dots = FALSE)
  x <- as.character(mc$...)
  d[, c(xyz = mean(hp), mget(x)), cyl]
}

getvar(mpg, cyl, disp)
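
As a rough illustration of the capturing step on its own, this base R sketch isolates the first two lines of the function above. The helper name capture_names is hypothetical and exists only for this example:

```r
# Hypothetical helper: return the unevaluated arguments in ... as strings
capture_names <- function(...) {
  # expand.dots = FALSE keeps all dot arguments together under the name "..."
  mc <- match.call(expand.dots = FALSE)
  # the captured symbols are converted to a character vector
  as.character(mc$...)
}

capture_names(mpg, cyl, disp)
# [1] "mpg"  "cyl"  "disp"
```

The arguments are never evaluated, so it does not matter that no objects called mpg, cyl or disp exist in the calling environment.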
Jaap
3

To get this to work without quoting the column names, you'd have to use some non-standard evaluation tactics:

getvar = function(...){
  vars <- substitute(list(xyz = mean(hp), ...))
  return(d[, eval(vars), cyl])
}

getvar(mpg, cyl, disp)
    cyl       xyz  mpg cyl  disp
 1:   6 122.28571 21.0   6 160.0
 2:   6 122.28571 21.0   6 160.0
 3:   6 122.28571 21.4   6 258.0
 4:   6 122.28571 18.1   6 225.0
 5:   6 122.28571 19.2   6 167.6
...etc...
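
To make the substitute step less opaque, this small base R sketch prints the expression that gets built before data.table evaluates it. The helper name build_j and the toy values passed to eval are hypothetical:

```r
# Hypothetical helper: build the j expression without evaluating it
build_j <- function(...) substitute(list(xyz = mean(hp), ...))

expr <- build_j(mpg, cyl, disp)
print(expr)
# list(xyz = mean(hp), mpg, cyl, disp)

# Evaluating the expression against a plain list of made-up columns
# behaves like evaluating it against one group of the table
toy <- list(hp = c(110, 90), mpg = c(21, 22), cyl = c(6, 4), disp = c(160, 108))
res <- eval(expr, toy)
str(res)
```

So the function effectively writes out the same call you would have typed by hand in j, and eval(vars) inside d[...] then runs it with the columns in scope.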
teunbrand
2

Building on the answer by @Konrad Rudolph here, we can write the function:

getvar = function(...){
   dots = match.call(expand.dots = FALSE)$...
   cols = sapply(dots, deparse)
   d[, c(xyz = mean(hp), mget(cols)), cyl]
   #thanks to @Jaap for simplified version
}

getvar(mpg, cyl, disp)
#    cyl    xyz  mpg cyl  disp
# 1:   6 122.29 21.0   6 160.0
# 2:   6 122.29 21.0   6 160.0
# 3:   6 122.29 21.4   6 258.0
# 4:   6 122.29 18.1   6 225.0
# 5:   6 122.29 19.2   6 167.6
# 6:   6 122.29 17.8   6 167.6
#....
Ronak Shah
  • 1
    `d[, c(xyz = mean(hp), mget(cols)), cyl]` instead of `d[, xyz := mean(hp), cyl][,mget(c("xyz", cols))]`? – Jaap Nov 05 '19 at 08:21