
I would like to run a regression within a data.table. The formula needs to be constructed dynamically. I have tried the following method:

x = data.table(a=1:20, b=20:1, id=1:5)
> x[,as.list(coef(lm(as.formula("a ~ b")))),by=id]
  Error in eval(expr, envir, enclos) : object 'a' not found

How does one specify the environment to be that of the actual data.table where the evaluation occurs?

EDIT: I realize I can write lm(a ~ b) directly, but I need the formula to be dynamic, so it is built up as a character string. By "dynamic" I mean the formula can be paste0(var_1, "~", var_2), where var_1 = "a" and var_2 = "b".

Here is one solution, though I think we can do better:

txt = parse(text="as.list(coef(lm(a ~ b)))")
> x[,eval(txt),by=id]
   id (Intercept)  b
1:  1          21 -1
2:  2          21 -1
3:  3          21 -1
4:  4          21 -1
5:  5          21 -1
Alex
  • I think this is just a duplicate of http://stackoverflow.com/questions/14721592/r-dynamically-build-list-in-data-table-or-ddply/14721921#14721921 . Not voting to close yet because I think you need to explain and illustrate better what you mean by "build dynamically". – IRTFM Feb 09 '13 at 01:59
  • Will read through; I didn't see that one, but I don't think it's a duplicate. In particular, how does one get a handle on the environment within the actual data.table? – Alex Feb 09 '13 at 02:00

1 Answer


lm can accept a character string as the formula, so combine that with .SD (the subset of data for the current group) like this:

> x[, as.list(coef(lm("a ~ b", .SD))), by = id]
   id (Intercept)  b
1:  1          21 -1
2:  2          21 -1
3:  3          21 -1
4:  4          21 -1
5:  5          21 -1
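
To connect this to the dynamic case in the question, here is a minimal sketch that builds the formula from character variables. The names var_1 and var_2 come from the question's EDIT; using reformulate() instead of paste0() is my own substitution, not part of the original answer, and passing data = .SD follows the approach above.

```r
library(data.table)

x <- data.table(a = 1:20, b = 20:1, id = 1:5)

# Column names held as strings (var_1 and var_2 as named in the question)
var_1 <- "a"  # response
var_2 <- "b"  # predictor

# reformulate() builds the formula a ~ b from the character strings
# (an assumption on my part; a pasted string would also be accepted by lm)
f <- reformulate(var_2, response = var_1)

# data = .SD hands the current group's columns to lm explicitly
res <- x[, as.list(coef(lm(f, data = .SD))), by = id]
print(res)
```

Because the data is supplied through .SD, lm looks the columns up in the per-group subset rather than relying on data.table to detect them in the j expression.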
G. Grothendieck