3

This has probably been asked before, but I couldn't find it. I have the variables a, b, c and d, and I want to write a loop that fits a regression, adds the next variable, and then fits again with the additional variable:

lm(Y ~ a, data = data), then lm(Y ~ a + b, data = data), then lm(Y ~ a + b + c, data = data), etc.

How would you do this?

Textime
  • Related post: https://stackoverflow.com/questions/22955617/linear-models-in-r-with-different-combinations-of-variables – zx8754 Apr 16 '19 at 11:16
  • The problem is that that post asks for all possible combinations. What I need is just this simple sequence: first a, then a and b, then a, b and c, and finally a, b, c and d. – Textime Apr 16 '19 at 11:19
  • Better to ask as a new question in a separate post, and link to this one. – zx8754 Apr 16 '19 at 13:08
  • Alright, I'll follow your advice. – Textime Apr 16 '19 at 13:09
  • And please consider accepting one of the answers so we can have this question as solved. – zx8754 Apr 16 '19 at 13:10
  • Perhaps you need the `?add1` function? It essentially does what you're trying to do. – Roman Luštrik Apr 16 '19 at 13:19

3 Answers

3

Use paste and as.formula. An example with the mtcars dataset:

myFits <- lapply(2:ncol(mtcars), function(i){
  x <- as.formula(paste("mpg", 
                        paste(colnames(mtcars)[2:i], collapse = "+"), 
                        sep = "~"))
  lm(formula = x, data = mtcars)
})

Note: this looks like a duplicate post. I have seen a better solution for this type of question, but I cannot find it at the moment.
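Applied to the variables from the question, the same paste/as.formula pattern might look like the sketch below. The data frame `dat` and its simulated values are assumptions made only so the example runs, and `Reduce(..., accumulate = TRUE)` is one possible way to build the cumulative predictor sets, not something taken from the answer above.

```r
# Hypothetical example data: response Y and predictors a, b, c, d
set.seed(1)
dat <- data.frame(matrix(rnorm(100), 20, 5,
                         dimnames = list(NULL, c("Y", "a", "b", "c", "d"))))

vars <- c("a", "b", "c", "d")

# Cumulative predictor sets: "a"; "a","b"; "a","b","c"; "a","b","c","d"
var_sets <- Reduce(c, vars, accumulate = TRUE)

# Build one formula per set with paste + as.formula, then fit it
fits <- lapply(var_sets, function(v) {
  f <- as.formula(paste("Y", paste(v, collapse = "+"), sep = "~"))
  lm(formula = f, data = dat)
})

# Each model has one more coefficient than the previous one
sapply(fits, function(m) length(coef(m)))
```

Because the predictor sets are built from `vars` rather than from column positions, the order of the columns in the data frame does not matter here.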

zx8754
3

You could do this with an lapply/reformulate approach.

formulae <- lapply(ivars, function(x) reformulate(x, response="Y"))
lapply(formulae, function(x) summary(do.call("lm", list(x, quote(dat)))))

Data

set.seed(42)
dat <- data.frame(matrix(rnorm(80), 20, 4, dimnames=list(NULL, c("Y", letters[1:3]))))
ivars <- sapply(1:3, function(x) letters[1:x])  # example list of cumulative sets of independent variables
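One plausible reason for the do.call/quote(dat) construction (this is an assumption; the answer does not say) is that the fitted model then records the expanded formula in its call instead of the name of the loop variable. A minimal sketch of the difference, reusing the example data from above:

```r
set.seed(42)
dat <- data.frame(matrix(rnorm(80), 20, 4,
                         dimnames = list(NULL, c("Y", letters[1:3]))))

f <- reformulate(c("a", "b"), response = "Y")  # builds Y ~ a + b

fit_plain  <- lm(f, data = dat)                    # formula passed as a variable
fit_docall <- do.call("lm", list(f, quote(dat)))   # formula passed by value

fit_plain$call    # the call records the symbol f as the formula
fit_docall$call   # the call records the expanded formula Y ~ a + b

# Both approaches fit the same model
all.equal(coef(fit_plain), coef(fit_docall))
```

The coefficients are the same either way; only the stored call, and therefore the header printed by summary(), differs.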
jay.sf
2
vars = c('a', 'b', 'c', 'd')
# might want to use a subset of names(data) instead of
# manually typing the names

reg_list = list()
for (i in seq_along(vars)) {
  my_formula = as.formula(sprintf('Y ~ %s', paste(vars[1:i], collapse = " + ")))
  reg_list[[i]] = lm(my_formula, data = data)
}

You can then inspect an individual result with, e.g., summary(reg_list[[2]]) (for the 2nd one).
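For a quick end-to-end check, the loop can be run on simulated data. In the sketch below the data frame, its coefficients, and the choice to take `vars` from names(data) are all assumptions for illustration; only the loop itself follows the code above.

```r
# Hypothetical data: Y depends on a and b; c and d are pure noise
set.seed(123)
data <- data.frame(a = rnorm(30), b = rnorm(30), c = rnorm(30), d = rnorm(30))
data$Y <- 1 + 2 * data$a - data$b + rnorm(30)

# Take the predictor names from the data instead of typing them
vars <- setdiff(names(data), "Y")

reg_list <- vector("list", length(vars))
for (i in seq_along(vars)) {
  my_formula <- as.formula(sprintf("Y ~ %s", paste(vars[1:i], collapse = " + ")))
  reg_list[[i]] <- lm(my_formula, data = data)
}

# R-squared of each model; it cannot decrease as predictors are added
r2 <- sapply(reg_list, function(m) summary(m)$r.squared)
r2
```

Collecting one statistic per model with sapply like this gives a compact overview before looking at the individual summaries.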

Gregor Thomas