1

Here is the output that I want from data.table.

library(data.table)
dt_mtcars <- as.data.table(mtcars)

## desired output ----
dt_mtcars[mpg >20
          , .(mean_mpg = mean(mpg)
              ,median_mpg = median(mpg))
          , .(cyl, gear)]

   cyl gear mean_mpg median_mpg
1:   6    4   21.000      21.00
2:   4    4   26.925      25.85
3:   6    3   21.400      21.40
4:   4    3   21.500      21.50
5:   4    5   28.200      28.20

I want to get the output by passing arguments to a function.

processFUN <- function(dt, where, select, group){

  out <- dt[i=eval(parse(text = where))
            ,j=eval(parse(text = select))
            ,by=eval(parse(text = group))]

  return(out)
}

report <- processFUN(dt_mtcars 
                     ,where= "mpg > 20"
                     ,select= ".(mean_mpg = mean(mpg), median_mpg = median(mpg))"
                     ,group= ".(cyl, gear)")

However, I get an error message.

 Error in .(cyl, gear) : could not find function "." 

William Chiu
  • 388
  • 3
  • 19
  • 2
    For calls that aren't quoted strings, this link may be helpful https://stackoverflow.com/questions/57122960/how-to-use-non-standard-evaluation-nse-to-evaluate-arguments-on-data-table/57126150#57126150 – Cole Jul 29 '19 at 12:04
  • 1
    what is the aim of your function? – Cath Jul 29 '19 at 12:04
  • 1
    If you prefer SQL syntax maybe use *sqldf*: `sqldf("select cyl, gear, avg(mpg) mean_mpg, median(mpg) median_mpg from mtcars where mpg > 20 group by cyl, gear")` – zx8754 Jul 29 '19 at 13:11
  • Related: https://stackoverflow.com/a/54797811/1191259 – Frank Jul 30 '19 at 18:24

3 Answers3

4

Just to give you an alternative, if you can/want to use table.express, you can also use strings in many situations:

library(data.table)
library(table.express)

processFUN <- function(dt, where, select, group) {
  dt %>%
    start_expr %>%
    group_by(!!!group, .parse = TRUE) %>%
    where(!!!where, .parse = TRUE) %>%
    transmute(!!!select, .parse = TRUE) %>%
    end_expr
}

processFUN(as.data.table(mtcars),
           "mpg>20",
           c("mean_mpg = mean(mpg)", "median_mpg = median(mpg)"),
           c("cyl", "gear"))
   cyl gear     V1    V2
1:   6    4 21.000 21.00
2:   4    4 26.925 25.85
3:   6    3 21.400 21.40
4:   4    3 21.500 21.50
5:   4    5 28.200 28.20

In the next release, start_expr and end_expr will be optional.

Alexis
  • 4,950
  • 1
  • 18
  • 37
3

Do you really want to pass conditions as strings? If so, one way would be to construct the query together using paste and then use eval(parse... to evaluate it

library(data.table)

processFUN <- function(dt, where, select, group){
    eval(parse(text = paste0(as.character(substitute(dt)), "[", where, ",", 
               select, ",by = ", group, "]")))
}

processFUN(dt_mtcars 
          ,where= "mpg > 20"
          ,select= ".(mean_mpg = mean(mpg), median_mpg = median(mpg))"
          ,group= ".(cyl, gear)")


#   cyl gear mean_mpg median_mpg
#1:   6    4   21.000      21.00
#2:   4    4   26.925      25.85
#3:   6    3   21.400      21.40
#4:   4    3   21.500      21.50
#5:   4    5   28.200      28.20
Ronak Shah
  • 377,200
  • 20
  • 156
  • 213
1

Or using eval with substitute:

library(data.table) #Win R-3.5.1 x64 data.table_1.12.2
dt_mtcars <- as.data.table(mtcars)

processFUN <- function(dt, where, select, group) {

    out <- dt[i=eval(substitute(where)), 
        j=eval(substitute(select)), 
        by=eval(substitute(group))]

    return(out)
}

processFUN(dt_mtcars, mpg>20, .(mean_mpg=mean(mpg), median_mpg=median(mpg)), .(cyl, gear))

Some of the earliest references that I can find are

  1. Aggregating sub totals and grand totals with data.table
  2. Using data.table i and j arguments in functions

The old faq 1.6 contains reference to this: http://datatable.r-forge.r-project.org/datatable-faq.pdf

chinsoon12
  • 25,005
  • 4
  • 25
  • 35
  • 1
    Re "using it outside will not work" -- seems to be false. Actually it works fine in j, just not in by=. `select = ".(mean_mpg = mean(mpg), median_mpg = median(mpg))"; dt_mtcars[, eval(parse(text = select))]` – Frank Jul 30 '19 at 18:24