
I have a data set (mydata) with 1000 records (rows) and 20 variables (columns, x1....x20). The first column is my response variable (y). All data is numeric with no missing values.

This works fine:

fit <- lm(y ~ x2 + x3 + ..... + x20, data = mydata); summary(fit)

I am trying to figure out how to avoid typing out all the variable names (i.e. x2 + x3 + x4, etc.).

I've tried:

predictors <- mydata[2:20]
fit <- lm(y ~ mydata[c(2:20)])  # also tried mydata[2:20] and predictors

This fails with the error: invalid type (list) for variable 'predictors'.

Is there a way around this? Thank you for any assistance.

camille
BDS
  • Does this answer your question? [How to succinctly write a formula with many variables from a data frame?](https://stackoverflow.com/questions/5251507/how-to-succinctly-write-a-formula-with-many-variables-from-a-data-frame) – camille Nov 25 '19 at 20:58

1 Answer

We can use . on the right-hand side of the formula to include every other column of the data as a predictor:

lm(y~ ., data = mydata)

If the data also contains columns other than y and those matching 'x\d+', subset to the relevant columns first:

lm(y ~ ., data = mydata[c('y', grep("^x\\d+$", names(mydata), value = TRUE))])
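As an alternative to subsetting the data, the formula itself can be built from the matching column names with reformulate() from the stats package. The sketch below is only illustrative: the simulated mydata, the extra id column, and the names xvars and f are assumptions made up for the example, not part of the original question.

```r
# Hypothetical stand-in for the asker's data: a response, two predictors,
# and an unrelated 'id' column that should stay out of the model.
set.seed(1)
mydata <- data.frame(
  y  = rnorm(50),
  x2 = rnorm(50),
  x3 = rnorm(50),
  id = 1:50
)

# Pick the predictor names by pattern, then build the formula y ~ x2 + x3
xvars <- grep("^x\\d+$", names(mydata), value = TRUE)
f <- reformulate(xvars, response = "y")

# Fit using the generated formula; 'id' is not included
fit <- lm(f, data = mydata)
summary(fit)
```

Building the formula explicitly can be convenient when the same set of predictors is reused across several models, since printing f shows exactly which terms went in.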

Reproducible example with mtcars:

lm(mpg ~ ., data = mtcars)
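To check what . expands to, here is a small sketch on a subset of mtcars comparing the . formula with the spelled-out version, plus one way to drop a single predictor using - in the formula. The object names (vars, fit_dot, fit_explicit, fit_drop) and the choice of columns are arbitrary and only for illustration.

```r
# Restrict to a few columns so the explicit formula stays short
vars <- c("mpg", "cyl", "disp", "hp")

# '.' stands for every column in 'data' other than the response
fit_dot      <- lm(mpg ~ ., data = mtcars[vars])
fit_explicit <- lm(mpg ~ cyl + disp + hp, data = mtcars)

# The two fits should have the same coefficients
all.equal(coef(fit_dot), coef(fit_explicit))

# Exclude one predictor while keeping the rest
fit_drop <- lm(mpg ~ . - hp, data = mtcars[vars])
names(coef(fit_drop))
```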
akrun