I've done a fair amount of reading here on SO and learned that I should generally avoid manipulation of formula objects as strings, but I haven't quite found how to do this in a safe manner:

tf <- function(formula = NULL, data = NULL, groups = NULL, ...) {
# Arguments are unquoted and in the typical form for lm etc
# Do some plotting with lattice using formula & groups (works, not shown)
# Append 'groups' to 'formula':
# Change y ~ x as passed in argument 'formula' to
# y ~ x * gr where gr is the argument 'groups' with
# scoping so it will be understood by aov
new_formula <- y ~ x * gr
# Now do some anova (could do if formula were right)
model <- aov(formula = new_formula, data = data)
# And print the aov table on the plot (can do)
print(summary(model)) # this will do for testing
}

Perhaps the closest I came was to use reformulate but that only gives + on the RHS, not *. I want to use the function like this:

p <- tf(carat ~ color, groups = clarity, data = diamonds)

and get the aov results for carat ~ color * clarity. Thanks in advance.

Solution

Here is a working version based on @Aaron's comment which demonstrates what's happening:

tf <- function(formula = NULL, data = NULL, groups = NULL, ...) {
  # Capture the unquoted 'groups' argument as text, e.g. "clarity"
  g <- deparse(substitute(groups))
  print(g)
  # Build ".~.* clarity"; update.formula() fills in the dots from 'formula'
  f <- paste(".~.*", g)
  new_formula <- update.formula(formula, f)
  print(new_formula)
  model <- aov(formula = new_formula, data = data)
  print(summary(model))
}
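
For readers who would rather not build the formula from a string at all, the same update can be sketched with language objects. This is only an illustration under stated assumptions, not part of the original solution: tf_lang is a hypothetical name, and mtcars stands in for diamonds so the example needs no extra packages.

```r
# Hypothetical helper (not from the original post): appends 'groups' as an
# interaction term without pasting strings together.
tf_lang <- function(formula, data, groups) {
  g <- substitute(groups)                   # the unquoted name, kept as a symbol
  template <- eval(bquote(. ~ . * .(g)))    # the formula `. ~ . * <groups>`
  new_formula <- update(formula, template)  # dots are filled in from 'formula'
  aov(new_formula, data = data)
}

fit <- tf_lang(mpg ~ vs, data = mtcars, groups = am)
attr(terms(fit), "term.labels")             # term labels: vs, am, vs:am
```

Note that update() expands the interaction, so the stored formula reads mpg ~ vs + am + vs:am rather than mpg ~ vs * am; the two specify the same model.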
Bryan Hanson

2 Answers

I think update.formula can solve your problem, but I've had trouble with update inside function calls. It works as coded below, but note that I'm passing the grouping column itself (dat$clarity), not the variable name. The function adds that column to its copy of the data frame under the name groups, and after that update works.

I'm also not sure it does exactly what you want for the second model (m2), so take a look at the help file for update.formula and experiment with it a bit.

http://stat.ethz.ch/R-manual/R-devel/library/stats/html/update.formula.html

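tf <- function(formula, groups, d) {
  d$groups <- groups                         # add the grouping column under a fixed name
  newForm <- update(formula, ~ . * groups)   # keep the LHS, cross the RHS with 'groups'
  lm(newForm, data = d)                      # return the fitted model
}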

dat  = data.frame(carat=rnorm(10,0,1),color=rnorm(10,0,1),color2=rnorm(10,0,1),clarity=rnorm(10,0,1))
m = tf(carat~color,dat$clarity,d=dat)
m2 = tf(carat~color+color2,dat$clarity,d=dat)

tf2 <- function(formula, group, d) {
  f <- paste(".~.*", deparse(substitute(group)))
  newForm <- update.formula(formula, f)
  lm(newForm, data=d)
}
mA = tf2(carat~color,clarity,d=dat)
m2A = tf2(carat~color+color2,clarity,d=dat)

EDIT: As @Aaron pointed out, it's deparse and substitute that solve my problem: I've added tf2 as the better option to the code example so you can see how both work.
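
To see what deparse and substitute contribute in tf2, here is a minimal sketch. The helper name show_capture is hypothetical and exists only for illustration; the point is that the argument is never evaluated, so no object called clarity has to exist when the function is called.

```r
# Hypothetical helper, for illustration only.
show_capture <- function(group) {
  expr <- substitute(group)   # the unevaluated argument, here the symbol clarity
  deparse(expr)               # its text form
}

label <- show_capture(clarity)   # works even though no 'clarity' object exists
label                            # the string "clarity"

# update() accepts the pasted string because it converts it to a formula first
f <- update(carat ~ color, paste(". ~ . *", label))
attr(terms(f), "term.labels")    # term labels: color, clarity, color:clarity
```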

slammaster
  • Thanks for looking at this, @slammaster. I think I may have had the same problems with `update.formula` inside a function! For the lattice part of the call, the groups argument has to be the unquoted name of something in the data frame, so I can't use 'dat$clarity'; I would have to use just 'clarity' as the argument. So the `lm` or `aov` call has to work the same way after appending groups. – Bryan Hanson Feb 12 '13 at 19:25
  • Try giving update a character string (formatting bad because in a comment, sorry...): `tf <- function(formula, group, d) { f <- paste(".~.*", deparse(substitute(group))); lm(update.formula(formula, f), data=d) }` – Aaron left Stack Overflow Feb 12 '13 at 19:31
  • @Aaron: Should be posted as an answer. – IRTFM Feb 12 '13 at 19:35
  • @Aaron Please make it an answer so I can accept it. You are more skilled in the dark arts than I! Thanks. Bryan – Bryan Hanson Feb 12 '13 at 20:58
  • Perhaps it should have been an answer, but it felt like a minor tweak to @slammaster's answer, who came up with the basic idea. Perhaps slammaster would be willing to edit to include the tweak, and then you can accept this one? – Aaron left Stack Overflow Feb 12 '13 at 22:00
  • In my mind it is the `deparse(substitute(groups))` that is the key magic that keeps it completely programmatic and general. I'll let it sit a while longer before accepting something. – Bryan Hanson Feb 12 '13 at 23:39

One technique I use when I have trouble with scoping and calling functions within functions is to pass the parameters as strings and then construct the call within the function from those strings. Here's what that would look like here.

tf <- function(formula, data, groups) {
  f <- paste(".~.*", groups)
  m <- eval(call("aov", update.formula(as.formula(formula), f), data = as.name(data)))
  summary(m)
}

tf("mpg~vs", "mtcars", "am") 
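
The role of as.name(data) in the function above can be seen by building the call step by step. This is a rough sketch rather than a definitive recipe, and the variable names data_name and cl are my own, chosen for illustration.

```r
# Build the call piece by piece; aov() does not run until eval() is called.
data_name <- "mtcars"                        # data set given as a string
cl <- call("aov", mpg ~ vs * am, data = as.name(data_name))

cl$data                                      # the symbol mtcars, not the data frame itself
fit <- eval(cl)                              # now aov() runs and looks up mtcars
summary(fit)
```

Because the data argument is stored as a symbol, the call recorded in the fitted model refers to the data set by name instead of embedding a copy of the data frame.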

See this answer to one of my previous questions for another example of this: https://stackoverflow.com/a/7668846/210673.

Also see this answer to the sister question of this one, where I suggest something similar for use with xyplot: https://stackoverflow.com/a/14858661/210673

Aaron left Stack Overflow
  • Thanks for the additional suggestions and links. I have working versions of my functions that take quoted names as arguments using methods similar to what you illustrate. For some reason I got it into my head to convert to a more 'official' or slicker formula interface, which led me into this quagmire! I'll emerge more knowledgeable in the long run, but it was much more work than I thought when I began. Thanks again. – Bryan Hanson Feb 13 '13 at 21:11
  • My answer here may also be useful to future searchers: http://stackoverflow.com/a/14940094/210673 – Aaron left Stack Overflow Feb 18 '13 at 15:50