I want to create a list of formulas for simple linear regression and 2nd- and 3rd-order polynomial models. This is what I did. It is fine for a few variables, but my data has a lot of variables. How can I do the same thing without writing every model out by hand?

ind1.lm <- lm(dep ~ ind1, data = df)
ind1.qd <- lm(dep ~ poly(ind1, 2, raw = TRUE), data = df)
ind1.cb <- lm(dep ~ poly(ind1, 3, raw = TRUE), data = df)

ind2.lm <- lm(dep ~ ind2, data = df)
ind2.qd <- lm(dep ~ poly(ind2, 2, raw = TRUE), data = df)
ind2.cb <- lm(dep ~ poly(ind2, 3, raw = TRUE), data = df)

ind3.lm <- lm(dep ~ ind3, data = df)
ind3.qd <- lm(dep ~ poly(ind3, 2, raw = TRUE), data = df)
ind3.cb <- lm(dep ~ poly(ind3, 3, raw = TRUE), data = df)

formula.list <- list(as.formula(ind1.lm), as.formula(ind1.qd), 
    as.formula(ind1.cb), as.formula(ind2.lm), as.formula(ind2.qd), 
    as.formula(ind2.cb), as.formula(ind3.lm), as.formula(ind3.qd), 
    as.formula(ind3.cb))

formula.list

Thanks in advance!

R starter
  • Here you are creating the model, `lm(dep ~ ind1, data = df)`, instead of the formula. If you want to create the formula, use `paste`. – akrun May 26 '19 at 14:31
  • @akrun Thank you. My problem is that I don't see how to use paste() for the polynomial models. – R starter May 26 '19 at 14:36

1 Answer

Define the independent variables and the formula templates, then use sprintf to build the formula strings from them. Since lm accepts a string as a formula, we can apply lm over those strings, which gives a list of lm objects whose names are the formulas.

ind <- c("ind1", "ind2", "ind3")
fmt <- c("dep ~ %s", 
         "dep ~ poly(%s, 2, raw=TRUE)", 
         "dep ~ poly(%s, 3, raw=TRUE)")

fo.strings <- c(outer(fmt, ind, sprintf))

sapply(fo.strings, lm, data = df, simplify = FALSE)
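
If a list of formula objects is wanted rather than fitted models, which is what the question literally asks for, the same strings can be passed through as.formula instead of lm. This is a minimal sketch that repeats the definitions above so it runs on its own; the name formula.list is borrowed from the question, and no data frame is needed for this step.

```r
ind <- c("ind1", "ind2", "ind3")
fmt <- c("dep ~ %s",
         "dep ~ poly(%s, 2, raw = TRUE)",
         "dep ~ poly(%s, 3, raw = TRUE)")

# outer() fills a 3 x 3 character matrix (rows = templates, columns = variables);
# c() flattens it column by column, so the three formulas for ind1 come first.
fo.strings <- c(outer(fmt, ind, sprintf))

# Convert each string to a formula object and name it by its string.
formula.list <- lapply(fo.strings, as.formula)
names(formula.list) <- fo.strings

formula.list
```

Adding more predictors then only means extending ind, and adding another model form only means extending fmt.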

Questions on SO should include reproducible code, and df was omitted from the question, but we can run the same approach on the built-in anscombe data frame like this:

fmt <- c("y1 ~ %s", 
         "y1 ~ poly(%s, 2, raw=TRUE)", 
         "y1 ~ poly(%s, 3, raw=TRUE)")
ind <- c("x1", "x2", "x3")
fo.strings <- c(outer(fmt, ind, sprintf))
sapply(fo.strings, lm, data = anscombe, simplify = FALSE)
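
To compare the three model forms for each predictor, one option is to pull R-squared out of every fitted model in the list. This is only a sketch on the anscombe example; the names models, r2 and r2.table are introduced here for illustration and do not appear in the original code.

```r
fmt <- c("y1 ~ %s",
         "y1 ~ poly(%s, 2, raw = TRUE)",
         "y1 ~ poly(%s, 3, raw = TRUE)")
ind <- c("x1", "x2", "x3")
fo.strings <- c(outer(fmt, ind, sprintf))

# Named list of fitted models, one per formula string.
models <- sapply(fo.strings, lm, data = anscombe, simplify = FALSE)

# One R-squared value per model, named by its formula string.
r2 <- sapply(models, function(m) summary(m)$r.squared)

# Arrange as a matrix: rows = model form, columns = predictor.
# This relies on fo.strings being ordered with the template varying fastest.
r2.table <- matrix(r2, nrow = length(fmt),
                   dimnames = list(c("linear", "quadratic", "cubic"), ind))
round(r2.table, 3)
```

Because the models are nested, R-squared cannot decrease as the polynomial order rises, so adjusted R-squared (summary(m)$adj.r.squared) may be the fairer comparison.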
G. Grothendieck
  • @G.Grothendieck Thank you so much. This is exactly what I need. – R starter May 29 '19 at 13:33
  • @G.Grothendieck Is it possible to plot the 3 models for each independent variable against the dependent variable, along with their equations and R2, to compare the model results for each predictor variable? – R starter May 29 '19 at 15:44
  • https://stackoverflow.com/questions/7549694/adding-regression-line-equation-and-r2-on-graph – G. Grothendieck May 29 '19 at 15:52