0

I am trying to create a for loop to index thorugh each individual response variable I have and train a model using the train() funciton within the Caret Package. I have about 30 response variable and 43 predictor variables. I can train each model individually but I would like to automate the process and have a for loop run through a model (I would like to eventually upscale to multiple models if possible, i.e. lm, rf, cubist, etc.). I then want to save each model to a dataframe along with R-squared values and RMSE values. The individual models that I currenlty have that will run for me goes as follows, with column 11 being the response variable and column 35-68 being predictor variables.

data_Mg <- subset(data_3, !is.na(Mg))
mg.lm <- train(Mg~., data=data_Mg[,c(11,35:68)], method="lm", trControl=control)
mg.cubist <- train(Mg~., data=data_Mg[,c(11,35:68)], method="cubist", trControl=control)
mg.rf <- train(Mg~., data=data_Mg[,c(11,35:68)], method="rf", trControl=control, na.action = na.roughfix)
max(mg.lm$results$Rsquared)
min(mg.lm$results$RMSE)
max(mg.cubist$results$Rsquared)
min(mg.cubist$results$RMSE)
max(mg.rf$results$Rsquared) #Highest R squared
min(mg.rf$results$RMSE)

This gives me 3 models with everything the relevant information that I need. Now for the for loop. I've only tried the lm model so far for this.

bucket <- list()
for(i in 1:ncol(data_4)) {       # for-loop response variables, need to end it at response variables, rn will run through all variables
  
  data_y<-subset(data_4, !is.na(i))#get rid of NA's in the "i" column
  predictors_i <- colnames(data_4)[i]    # Create vector of predictor names
  predictors_1.1 <- noquote(predictors_i)
  i.lm <- train(predictors_1.1~., data=data_4[,c(i,35:68)], method="lm", trControl=control)
  bucket <- i.lm
  #mod_summaries[[i - 1]] <- summary(lm(y ~ ., data_y[ , c("i.lm", predictors_i)]))

  #data_y <- data_4
}

Below is the error code that I am getting, with Bulk_Densi being the first variable in predictors_1.1. The error code is that variable lengths differ so I originally thought that my issue was that quotes were being added around "Bulk_Densi" but after trying the NoQuote() function I have not gotten anywehre so I am unsure of where I am going wrong.

Error code that I am getting

Please let me know if I can provide any extra info and thanks in advance for the help! I've already tried the info in How to train several models within a loop for and was struggling with that as well.

RW1201
  • 13
  • 3

0 Answers0