I'm trying to build a binary classifier with caret, optimizing for ROC. The method I tried was C5.0, and I get the following error and warnings:
    Error in train.default(x, y, weights = w, ...) :
      final tuning parameters could not be determined
    In addition: Warning messages:
    1: In nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
      There were missing values in resampled performance measures.
    2: In train.default(x, y, weights = w, ...) :
      missing values found in aggregated results
I had earlier modelled the same training data with C5.0 and caret, optimizing for Accuracy and without twoClassSummary in the control, and it ran without error.
My tuning grid and control for the ROC run were:
    c50Grid <- expand.grid(.trials = c(1:9, (1:10)*10),
                           .model = c("tree", "rules"),
                           .winnow = c(TRUE, FALSE))
    fitTwoClass <- trainControl(
      method = "repeatedcv",
      number = 5,
      repeats = 5,
      classProbs = TRUE,
      summaryFunction = twoClassSummary
    )
For the Accuracy run, I omitted the classProbs and summaryFunction arguments from the control.
For the modeling, the command was:
    fitModel <- train(
      Unhappiness ~ .,
      data = dnumTrain,
      tuneGrid = c50Grid,
      method = "C5.0",
      trControl = fitTwoClass,
      tuneLength = 5,
      metric = "ROC"
    )
Can anyone advise how to troubleshoot this? I'm not sure which parameter, if any, needs to be tweaked to make this work. I believe the dataset itself is fine, since the same data ran without error when optimizing for Accuracy.
To reproduce, the training set dnumTrain can be loaded from the file in this link.
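One check that might help narrow this down, offered as a sketch rather than a confirmed diagnosis: with classProbs = TRUE, caret builds one probability column per class level, so the outcome's factor levels need to be syntactically valid R names. The Accuracy run did not request class probabilities, so it would not have exercised this. The `outcome` vector below is a hypothetical stand-in for dnumTrain$Unhappiness, whose actual levels are not shown in this post; only base R functions are used.

```r
# Hypothetical stand-in for dnumTrain$Unhappiness (real levels are not shown in the post).
outcome <- factor(c(0, 1, 1, 0, 1))

# A level is usable as a column name only if make.names() leaves it unchanged.
lv <- levels(outcome)
valid <- lv == make.names(lv)
print(data.frame(level = lv, valid = valid))

# If any level fails the check, relabel it before calling train().
if (!all(valid)) {
  levels(outcome) <- make.names(lv)
}
print(levels(outcome))
```

If the real outcome's levels already pass this check, the cause lies elsewhere and this step can be ruled out.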