
According to this answer, you can evaluate a formula given as text inside data.table:

library(data.table)
set.seed(1)
foo = data.table(var1=sample(1:3,1000,replace = TRUE), var2=rnorm(1000),  var3=sample(letters[1:5],1000,replace = TRUE))
var_i="var1"
var_j="var1+0.02*exp(var2)"
var_by="var3"
eval_text_formula <- function(s, env) eval(parse(text=s), envir = env, enclos = parent.frame())
foo[var1 == 1, sum(eval_text_formula(var_j, .SD)), by = var3]


   var3       V1
1:    d 72.74060
2:    e 77.10872
3:    c 69.48776
4:    b 84.22668
5:    a 58.53409

I'd like to extend that answer: what happens if I pass a formula object instead of a string?

var_j=as.formula("~var1+0.02*exp(var2)")

> foo[var1 == 1, sum(eval_text_formula(var_j, .SD)), by = var3]
Error in sum(eval_text_formula(var_j, .SD)) : 
  invalid 'type' (language) of argument

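The error message suggests that sum() received a language object rather than numbers. A minimal base-R sketch of how a formula is stored may help here; the names f, parts and err are illustrative only and do not come from the question.

```r
f <- as.formula("~var1+0.02*exp(var2)")

class(f)    # "formula"
typeof(f)   # "language": an unevaluated call to `~`, not a string
length(f)   # 2 for a one-sided formula: the `~` symbol and the right-hand side

parts <- as.list(f)
parts[[1]]  # the symbol `~`
parts[[2]]  # the call var1 + 0.02 * exp(var2)

# sum() has no way to add up a language object, so it raises an error;
# the message is captured here instead of stopping the script.
err <- tryCatch(sum(f), error = function(e) conditionMessage(e))
err
```

The second element is already the expression of interest, so no text conversion is strictly needed to reach it.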
The clumsy workaround I found is to convert the formula back to a string with var_j=as.character(var_j) and keep its second element:

> var_j=as.formula("~var1+0.02*exp(var2)")
> var_j=as.character(var_j)
> var_j
[1] "~"                       "var1 + 0.02 * exp(var2)"
> var_j=var_j[2]
> foo[var1 == 1, sum(eval_text_formula(var_j, .SD)), by = var3]
   var3       V1
1:    d 72.74060
2:    e 77.10872
3:    c 69.48776
4:    b 84.22668
5:    a 58.53409

I feel there is a concept I am missing.

Captain Tyler

2 Answers


One option is to convert the formula to a character string and remove the ~:

var_j2 <- sub("~", "", deparse(var_j))
foo[var1 == 1, sum(eval_text_formula(var_j2, .SD)), by = var3]
#   var3       V1
#1:    d 72.74060
#2:    e 77.10872
#3:    c 69.48776
#4:    b 84.22668
#5:    a 58.53409

Or split the formula into a list with as.list, select the second component, and convert it to character with deparse:

foo[var1 == 1, sum(eval_text_formula(deparse(as.list(var_j)[[2]]), .SD)), by = var3]
#   var3       V1
#1:    d 72.74060
#2:    e 77.10872
#3:    c 69.48776
#4:    b 84.22668
#5:    a 58.53409

With the tidyverse, the formula can be converted to a quosure with as_quosure (from rlang) and unquoted with !!:

library(rlang)
library(dplyr)
foo %>% 
   filter(var1 == 1) %>%
   group_by(var3) %>%
   summarise(val = sum(!! as_quosure(var_j)))
# A tibble: 5 x 2
#   var3    val
#* <chr> <dbl>
#1 a      58.5
#2 b      84.2
#3 c      69.5
#4 d      72.7
#5 e      77.1
akrun

It seems safer to extract the right-hand side of the formula with

var_j2 <- tail(as.character(var_j), 1)

although this still coerces the formula object to character.
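The coercion can probably be avoided altogether, because the right-hand side of a formula is already an unevaluated call that eval() accepts directly. The following is a minimal base-R sketch under that assumption; eval_formula_rhs is a hypothetical helper name, not a data.table or base R function, and the small data frame df is made up for illustration.

```r
# Hypothetical helper: evaluate the right-hand side of a formula against
# a data frame or list, with no deparse()/parse() round trip.
eval_formula_rhs <- function(f, data) {
  stopifnot(inherits(f, "formula"))
  rhs <- f[[length(f)]]  # last element is the right-hand side
  # Variables not found in `data` are looked up in the formula's environment
  eval(rhs, envir = data, enclos = environment(f))
}

# Illustrative data only
df <- data.frame(var1 = c(1, 2, 3), var2 = c(0, 0, 0))
f <- as.formula("~var1+0.02*exp(var2)")

res <- eval_formula_rhs(f, df)
res
```

Since .SD is itself a list of columns, the same helper should be usable in the original call as foo[var1 == 1, sum(eval_formula_rhs(var_j, .SD)), by = var3], in the same way eval_text_formula is used there.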

Captain Tyler