How to predict on a new dataset using caretEnsemble package in R?

Question

I am using the caretEnsemble package in R to combine multiple models trained with caret. I built the list of trained models (model_list) with the caretList function from the same package:

    model_list <- caretList(
        x = input_predictors,
        y = input_labels,
        metric = 'Accuracy',
        tuneList = list(
            randomForestModel = caretModelSpec(method = 'rf',
                                               tuneLength = 1,
                                               preProcess = c('BoxCox', 'center', 'scale')),
            ldaModel = caretModelSpec(method = 'lda',
                                      tuneLength = 1,
                                      preProcess = c('BoxCox', 'center', 'scale')),
            logisticRegressionModel = caretModelSpec(method = 'glm',
                                                     tuneLength = 1,
                                                     preProcess = c('BoxCox', 'center', 'scale'))
        ),
        trControl = myTrainControl
    )

The train control object I provided was:

    myTrainControl <- trainControl(method = "cv",
                                   number = 10,
                                   index = createResample(training_input_data$retinopathy, 10),
                                   savePredictions = TRUE,
                                   classProbs = TRUE,
                                   verboseIter = TRUE,
                                   summaryFunction = twoClassSummary)

I then build the ensemble from that list of models:

    ens <- caretEnsemble(model_list)

Calling summary on ens reports the models selected from model_list, the weight assigned to each, the out-of-sample AUC of each selected model, and the in-sample AUC of ens.

Now I want to measure the performance of ens on separate test data, to estimate its out-of-sample performance. How can I do that?

I tried:

    ensPredictions <- predict(ens, newdata = test_data)

but it fails with this error:

    Error in `[.data.frame`(out, , obsLevels, drop = FALSE) : 
      undefined columns selected

Please provide a [minimal reproducible example](http://stackoverflow.com/a/5963610/345660) — Zach, Sep 16 '15 at 13:22

score 1 · Answer 1 · answered Sep 15 '15 at 02:32

The first thing I'd check is whether the test set has all the features of your training set.
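A minimal sketch of that check in base R, assuming the training predictors and the test set are both data frames. The toy `train_predictors` and `test_data` below, and their column names, are made up for illustration and are not the data from the question. `setdiff` lists training columns that are absent from the test set, and a second pass flags shared columns whose classes differ:

```r
# Hypothetical toy data standing in for the real training predictors and test set.
train_predictors <- data.frame(age   = c(50, 61, 47),
                               bmi   = c(27.1, 31.4, 24.9),
                               hba1c = c(7.2, 8.1, 6.5))
test_data <- data.frame(age = c(55, 63),
                        bmi = c(29.0, 26.3))

# Columns used in training that are absent from the test set.
missing_features <- setdiff(names(train_predictors), names(test_data))

if (length(missing_features) > 0) {
  message("Test set is missing: ", paste(missing_features, collapse = ", "))
}

# Shared columns whose classes differ (e.g. factor in one set, character in the other).
shared <- intersect(names(train_predictors), names(test_data))
class_mismatch <- shared[vapply(shared, function(col) {
  !identical(class(train_predictors[[col]]), class(test_data[[col]]))
}, logical(1))]
```

If `missing_features` or `class_mismatch` is non-empty, it is worth fixing those columns before calling `predict` again. This only rules out one possible cause; it may not be what triggers the error in the question.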

suresh
