You intend to use "ROC" (the area under the ROC curve) to pick the best tuning parameters, but you do not specify summaryFunction = twoClassSummary, which is the summary function that computes this metric. This is what the warning is telling you:
Warning message:
In train.default(x, y, weights = w, ...) :
The metric "ROC" was not in the result set. Accuracy will be used instead.
Perform the tuning:
library(tidyverse)
library(caret)
library(glmnet)
library(mlbench)
data(PimaIndiansDiabetes, package="mlbench")
set.seed(2323)
train.data <- PimaIndiansDiabetes
set.seed(2323)
model <- train(
  diabetes ~ ., data = train.data, method = "glmnet",
  trControl = trainControl("cv",
                           number = 10,
                           classProbs = TRUE,
                           savePredictions = TRUE,
                           summaryFunction = twoClassSummary),
  tuneLength = 10,
  metric = "ROC" # the ROC metric comes from twoClassSummary
)
Since you specified classProbs = TRUE
and savePredictions = TRUE
you can calculate any metric based on the saved predictions.
To calculate accuracy:
model$pred %>%
  filter(alpha == model$bestTune$alpha,   # keep predictions for the best tuning parameters
         lambda == model$bestTune$lambda) %>%
  group_by(Resample) %>%                  # group by fold
  summarise(acc = sum(pred == obs) / n()) # calculate the metric
#output
`summarise()` ungrouping output (override with `.groups` argument)
# A tibble: 10 x 2
Resample acc
<chr> <dbl>
1 Fold01 0.740
2 Fold02 0.753
3 Fold03 0.818
4 Fold04 0.776
5 Fold05 0.779
6 Fold06 0.753
7 Fold07 0.766
8 Fold08 0.792
9 Fold09 0.727
10 Fold10 0.789
This gives you the per-fold metric. To get the average performance across folds:
model$pred %>%
  filter(alpha == model$bestTune$alpha,
         lambda == model$bestTune$lambda) %>%
  group_by(Resample) %>%
  summarise(acc = sum(pred == obs) / n()) %>%
  pull(acc) %>%
  mean()
#output
0.769566
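As another example (a minimal sketch of my own, not part of the original answer), the same saved predictions give you per-fold Cohen's Kappa via caret::postResample():
# caret and tidyverse are already loaded above
model$pred %>%
  filter(alpha == model$bestTune$alpha,
         lambda == model$bestTune$lambda) %>%
  group_by(Resample) %>%
  summarise(kappa = postResample(pred, obs)[["Kappa"]]) %>% # Kappa per fold
  pull(kappa) %>%
  mean()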
When ROC is used as the selection metric, the hyperparameters are optimized over all decision thresholds. In many cases the chosen model will perform suboptimally at the default decision threshold of 0.5.
caret has a function thresholder()
which calculates many metrics on the resampled data over the specified decision thresholds.
thresholder(model, seq(0, 1, length.out = 10)) #in reality I would use length.out = 100
#output
alpha lambda prob_threshold Sensitivity Specificity Pos Pred Value Neg Pred Value Precision Recall F1 Prevalence Detection Rate Detection Prevalence Balanced Accuracy Accuracy
1 0.1 0.03607775 0.0000000 1.000 0.00000000 0.6510595 NaN 0.6510595 1.000 0.7886514 0.6510595 0.6510595 1.0000000 0.5000000 0.6510595
2 0.1 0.03607775 0.1111111 0.994 0.02621083 0.6557464 0.7380952 0.6557464 0.994 0.7901580 0.6510595 0.6471463 0.9869617 0.5101054 0.6562714
3 0.1 0.03607775 0.2222222 0.986 0.15270655 0.6850874 0.8711111 0.6850874 0.986 0.8082906 0.6510595 0.6419344 0.9375256 0.5693533 0.6952837
4 0.1 0.03607775 0.3333333 0.964 0.32421652 0.7278778 0.8406807 0.7278778 0.964 0.8290127 0.6510595 0.6276316 0.8633459 0.6441083 0.7408578
5 0.1 0.03607775 0.4444444 0.928 0.47364672 0.7674158 0.7903159 0.7674158 0.928 0.8395895 0.6510595 0.6041866 0.7877990 0.7008234 0.7695147
6 0.1 0.03607775 0.5555556 0.862 0.59002849 0.7970454 0.7053968 0.7970454 0.862 0.8274687 0.6510595 0.5611928 0.7043575 0.7260142 0.7669686
7 0.1 0.03607775 0.6666667 0.742 0.75740741 0.8521972 0.6114289 0.8521972 0.742 0.7926993 0.6510595 0.4830827 0.5677204 0.7497037 0.7473855
8 0.1 0.03607775 0.7777778 0.536 0.90284900 0.9156149 0.5113452 0.9156149 0.536 0.6739140 0.6510595 0.3489918 0.3828606 0.7194245 0.6640636
9 0.1 0.03607775 0.8888889 0.198 0.98119658 0.9573810 0.3967404 0.9573810 0.198 0.3231917 0.6510595 0.1289474 0.1354751 0.5895983 0.4713602
10 0.1 0.03607775 1.0000000 0.000 1.00000000 NaN 0.3489405 NaN 0.000 NaN 0.6510595 0.0000000 0.0000000 0.5000000 0.3489405
Kappa J Dist
1 0.0000000 0.00000000 1.0000000
2 0.0258717 0.02021083 0.9738516
3 0.1699809 0.13870655 0.8475624
4 0.3337322 0.28821652 0.6774055
5 0.4417759 0.40164672 0.5329805
6 0.4692998 0.45202849 0.4363768
7 0.4727251 0.49940741 0.3580090
8 0.3726156 0.43884900 0.4785352
9 0.1342372 0.17919658 0.8026597
10 0.0000000 0.00000000 1.0000000
Now pick a threshold based on your desired metric and use that. The metrics usually used with imbalanced data are Cohen's Kappa, Youden's J, or the Matthews correlation coefficient (MCC). Here is a decent paper on the matter.
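For illustration, here is a minimal sketch (my own addition, not from the original output) of picking the threshold that maximizes Youden's J; swap J for Kappa or any other column of the thresholder() output you prefer:
ths <- thresholder(model, seq(0, 1, length.out = 100))

best_thr <- ths$prob_threshold[which.max(ths$J)] # threshold with the largest Youden's J
best_thr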
Please note that since this data was used to find the optimal threshold, the performance obtained this way will be optimistically biased. To evaluate the performance of the picked decision threshold, it would be best to use several independent test sets. In other words, I recommend nested resampling, where you optimize the parameters and the threshold on the inner folds and evaluate on the outer folds.
Here is an explanation of how to use nested resampling with caret for regression. Some modifications are needed to make it work for classification with an optimized threshold.
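As a rough sketch of the idea (my own, not the linked post's code, and written under my assumption that thresholder() applies the cutoff to the probability of the first factor level, "neg" for this data, which is consistent with the Prevalence of 0.65 in the output above), the outer loop could look like this:
set.seed(2323)
outer_folds <- createFolds(train.data$diabetes, k = 5) # outer test-fold indices

outer_acc <- sapply(outer_folds, function(idx) {
  inner_train <- train.data[-idx, ]
  outer_test  <- train.data[idx, ]

  # tune alpha/lambda on the inner folds, selecting by ROC as before
  fit <- train(
    diabetes ~ ., data = inner_train, method = "glmnet",
    trControl = trainControl("cv",
                             number = 10,
                             classProbs = TRUE,
                             savePredictions = TRUE,
                             summaryFunction = twoClassSummary),
    tuneLength = 10,
    metric = "ROC"
  )

  # pick the decision threshold on the inner resamples (here: max Youden's J)
  ths <- thresholder(fit, seq(0.1, 0.9, length.out = 81))
  thr <- ths$prob_threshold[which.max(ths$J)]

  # evaluate the tuned model + threshold on the untouched outer fold
  p_neg <- predict(fit, outer_test, type = "prob")[["neg"]]
  pred  <- ifelse(p_neg >= thr, "neg", "pos")
  mean(pred == outer_test$diabetes) # outer-fold accuracy (use your metric of choice)
})

mean(outer_acc)
The mean over the outer folds then estimates the performance of the whole procedure, including the threshold choice, on data that was never used for tuning.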
Please note that this is not the only way to pick the best decision threshold. Another way is to pick the desired metric a priori (MCC, for instance) and treat the decision threshold as a hyperparameter to be tuned jointly with all the other hyperparameters. I believe this is not supported in caret without creating a custom model.