
Hi, can anyone help me write a for loop or an apply function to run the code below for multiple models?

Simulated Data

set.seed(666)
x1 = rnorm(1000) 
x2 = rnorm(1000)
y = rbinom(1000,1,0.8)
df = data.frame(y=as.factor(y),x1=x1,x2=x2)

Splitting data into train and test sets

dt = sort(sample(nrow(df), nrow(df)*.5, replace = F))
trainset=df[dt,]; testset=df[-dt,]

Fitting logistic regression models

model1=glm( y~x1,data=trainset,family="binomial")
model2=glm( y~x1+x2,data=trainset,family="binomial")

Testing model accuracy in the train and test sets

I want to loop the code below over the models fitted above and print the AUC on the train set and the test set for each model.

require(pROC)
trainpredictions <- predict(object = model1, newdata = trainset)
trainpredictions <- as.ordered(trainpredictions)
testpredictions <- predict(object = model1, newdata = testset)
testpredictions <- as.ordered(testpredictions)
trainauc <- roc(trainset$y, trainpredictions)
testauc <- roc(testset$y, testpredictions)
print(trainauc$auc)
print(testauc$auc)
  • Are your models stored in a list? You might want to provide a subset of all your models. – acylam Oct 09 '17 at 17:23
  • What exactly do you want the output to be? Do you have any experience writing functions? You could probably use `lapply()` here. It would be easier to help with a proper [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) with sample input data. – MrFlick Oct 09 '17 at 17:23
  • @MrFlick I edited the question and added a reproducible example. – kishan vuddanda Oct 09 '17 at 18:19

1 Answer


Just put your models in a list

models <- list(
  model1 = glm( y~x1,data=trainset,family="binomial"),
  model2 = glm( y~x1+x2,data=trainset,family="binomial")
)
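
If the number of models grows, the list could also be built from formulas instead of typing each glm() call by hand. A minimal sketch, assuming the same simulated data and split as in the question; the name `formulas` is illustrative and not from the original post:

```r
# Recreate the question's simulated data and 50/50 split
set.seed(666)
x1 <- rnorm(1000)
x2 <- rnorm(1000)
y <- rbinom(1000, 1, 0.8)
df <- data.frame(y = as.factor(y), x1 = x1, x2 = x2)
dt <- sort(sample(nrow(df), nrow(df) * 0.5, replace = FALSE))
trainset <- df[dt, ]
testset <- df[-dt, ]

# One formula per model; the list names become the model names
# (`formulas` is a hypothetical name used only for this sketch)
formulas <- list(
  model1 = y ~ x1,
  model2 = y ~ x1 + x2
)

# Fit every formula on the training set and keep the fits in a named list
models <- lapply(formulas, function(f) {
  glm(f, data = trainset, family = "binomial")
})

names(models)
```

Adding another model then only means adding another formula to the list.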

Define a function for value extraction

getauc <- function(model) {
  trainpredictions <- predict(object = model, newdata = trainset)
  trainpredictions <- as.ordered(trainpredictions)
  testpredictions <- predict(object = model, newdata = testset)
  testpredictions <- as.ordered(testpredictions)
  trainauc <- roc(trainset$y, trainpredictions)
  testauc <- roc(testset$y, testpredictions)
  c(train = trainauc$auc, test = testauc$auc)
}

And sapply() that function to your list

sapply(models, getauc)
#          model1    model2
# train 0.5273818 0.5448066
# test  0.5025038 0.5146211
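
Since the question also asks about a for loop, the same table can be sketched with an explicit loop. This is only a sketch, not the method used above: it assumes pROC is installed, rebuilds the question's data so that it runs on its own, and uses a hypothetical helper `get_auc()` that takes the data sets as arguments instead of reading them from the global environment. The name `results` is illustrative as well.

```r
library(pROC)

# Recreate the question's simulated data, split, and models
set.seed(666)
x1 <- rnorm(1000)
x2 <- rnorm(1000)
y <- rbinom(1000, 1, 0.8)
df <- data.frame(y = as.factor(y), x1 = x1, x2 = x2)
dt <- sort(sample(nrow(df), nrow(df) * 0.5, replace = FALSE))
trainset <- df[dt, ]
testset <- df[-dt, ]

models <- list(
  model1 = glm(y ~ x1, data = trainset, family = "binomial"),
  model2 = glm(y ~ x1 + x2, data = trainset, family = "binomial")
)

# Hypothetical helper: AUC of one model on a train and a test data frame.
# Both data frames are assumed to have the response in a column named `y`.
get_auc <- function(model, train, test) {
  auc_on <- function(d) {
    # Predicted probabilities; AUC depends only on the ranking of the scores
    p <- predict(model, newdata = d, type = "response")
    # suppressMessages() hides pROC's notes about levels and direction
    as.numeric(auc(suppressMessages(roc(d$y, p))))
  }
  c(train = auc_on(train), test = auc_on(test))
}

# Pre-allocate a 2 x n matrix, then fill one column per model
results <- matrix(
  NA_real_,
  nrow = 2, ncol = length(models),
  dimnames = list(c("train", "test"), names(models))
)
for (nm in names(models)) {
  results[, nm] <- get_auc(models[[nm]], trainset, testset)
}

print(results)
```

The loop and sapply() produce the same shape of output; the loop just makes the pre-allocation and the per-model assignment explicit.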
MrFlick