
I am reading a data set as follows:

data<-read.csv("test.csv",sep=",",header=T)

The first column of test.csv is the response variable, and the remaining 20 columns are predictor variables. How can I write the lm formula for this kind of scenario? It does not seem like a correct approach to write the formula out by hand as

modelfit<-lm(data[,1]~data[,2]+data[,3]+...)
user288609
    You can use `lm(y~., data=mydata)` to regress the column `y` in `mydata` against all other columns in `mydata`. If you're going to use the formula syntax, I would stay away from indexing (`[ ]`) in the formula. – MrFlick Nov 08 '14 at 05:35
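
To illustrate the `.` shorthand from the comment above, here is a minimal sketch. It assumes the built-in `mtcars` data set as a stand-in for test.csv, with its first column `mpg` playing the role of the response; the names `fit_dot` and `fit_explicit` are only for illustration.

```r
# Sketch: built-in mtcars stands in for test.csv (assumption);
# its first column, mpg, is treated as the response.
fit_dot <- lm(mpg ~ ., data = mtcars)

# The same model with every predictor written out by hand.
fit_explicit <- lm(
  mpg ~ cyl + disp + hp + drat + wt + qsec + vs + am + gear + carb,
  data = mtcars
)

# Compare the two sets of coefficients.
same_fit <- isTRUE(all.equal(coef(fit_dot), coef(fit_explicit)))
print(same_fit)
```

The `.` on the right-hand side expands to every column of `data` that does not already appear in the formula, so the response does not need to be excluded manually.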

1 Answer


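You can build the formula from the column names, taking the first column as the response and all remaining columns as predictors: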

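data<-read.csv("test.csv",sep=",",header=TRUE)
variables <- colnames(data)
# first column is the response, the rest are predictors
depVar <- variables[1]
indepVars <- variables[-1]
# paste the names into "response ~ x1 + x2 + ..." and convert to a formula
myformulae <- as.formula(paste(depVar,paste(indepVars,collapse=' + '),sep = ' ~ '))
modelfit <-lm(myformulae,data=data)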
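
As a possible alternative to pasting strings together, base R's `reformulate()` builds the same kind of formula directly from character vectors. The sketch below is not part of the original answer; it assumes the built-in `mtcars` data set as a stand-in for test.csv, and the names `dat`, `response`, `predictors`, `fml`, and `fit` are hypothetical.

```r
# Sketch: built-in mtcars stands in for test.csv (assumption).
dat <- mtcars

# First column is the response, all remaining columns are predictors.
response <- names(dat)[1]
predictors <- names(dat)[-1]

# reformulate() assembles `response ~ p1 + p2 + ...` from the names.
fml <- reformulate(predictors, response = response)
print(fml)

fit <- lm(fml, data = dat)
```

Either approach keeps column indexing out of the formula itself, so the fitted model carries readable variable names.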
sayan dasgupta