
I need to aggregate a data.table and create a table with counts, means, and other statistics for several variables. The output table should always have the same format, but I need to aggregate by different grouping variables. How can I define the output columns and aggregate statistics once and reuse them for different by= choices?

# Create data.table
library(data.table)
DT <- data.table(iris)

# This works, but it is long and needs to be updated in multiple
# places whenever I update the output format
DT[,list(theCount=.N,
        meanSepalWidth=mean(Sepal.Width),
        meanPetalWidth=mean(Petal.Width)), 
   by=Species]

# This does not work. How could I achieve what I'm trying to do here?
col.list <-  list(theCount=.N,
        meanSepalWidth=mean(Sepal.Width),
        meanPetalWidth=mean(Petal.Width))
DT[,col.list,  by=Species]
Frank
Magnus
  • You can write `col.list = quote(yada yada); DT[, eval(col.list), by=Species]`. This was documented in an earlier version of the data.table FAQ (... not sure where it went) – Frank Jun 19 '17 at 14:14
  • Perhaps [this answer](https://stackoverflow.com/a/42433456/3817004) is helpful; it picks up [Matt Dowle's suggestion](https://stackoverflow.com/a/12392269/3817004) on how to deal with `eval()`. – Uwe Jun 19 '17 at 14:24
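
The `quote()`/`eval()` idea from the comments could look roughly like the sketch below. The attempt in the question fails because `list(...)` is evaluated immediately, outside `DT`, where columns such as `Sepal.Width` do not exist; `quote()` stores the expression unevaluated so that data.table can evaluate it per group. The second grouping (`bigPetal`, a petal-length threshold of 4) and the result names `by.species` and `by.size` are made up here purely for illustration.

```r
library(data.table)
DT <- data.table(iris)

# Define the output columns once. quote() keeps the expression unevaluated,
# so .N and the column names are only resolved inside DT[...].
col.list <- quote(list(theCount       = .N,
                       meanSepalWidth = mean(Sepal.Width),
                       meanPetalWidth = mean(Petal.Width)))

# Reuse the same expression with different by= choices.
by.species <- DT[, eval(col.list), by = Species]

# Hypothetical second grouping, only to show the expression being reused.
by.size <- DT[, eval(col.list), by = .(bigPetal = Petal.Length > 4)]
```

Changing the output format then means editing only `col.list`; every `DT[, eval(col.list), by = ...]` call picks up the change.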

0 Answers