I am using caret to tune an MLP with 10-fold CV (repeated 5 times). I would like the summary output to include the prSummary metrics (F1, Precision, Recall) as well as the standard Accuracy and Kappa scores.
- With caret::defaultSummary() I get the desired Accuracy and Kappa values, but F1, Precision and Recall are missing.
- With the prSummary() function the opposite is true: Kappa and Accuracy are missing.
- Is there a way to get both sets of metrics at once? I provide a toy example below with the iris dataset, where one class is removed to obtain a binary classification problem.
Q2) On a side note: is it advisable to use the seeds parameter as I did, for reproducibility of the cross-validation? Since the seeds themselves are drawn by unseeded random sampling, my code is probably still not reproducible, right?
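For reference, a sketch of what I assume a fully seeded version would look like, based on my reading of ?trainControl: the list needs length B+1 (B = total number of resamples), each of the first B elements holds one seed per tuning candidate, and the seed generation itself is seeded so the whole construction is deterministic:

```r
# Sketch: a reproducible seeds list for 10-fold CV repeated 5 times
# with 4 tuning candidates (matching the setup below).
set.seed(1337)                       # seed the seed generation itself
n.resamples <- 10 * 5                # number = 10, repeats = 5
n.tune      <- 4                     # rows in the tuning grid
df1.seeds <- vector(mode = "list", length = n.resamples + 1)
for (i in seq_len(n.resamples)) {
  df1.seeds[[i]] <- sample.int(10000, n.tune)   # distinct seeds per resample
}
df1.seeds[[n.resamples + 1]] <- sample.int(10000, 1)  # for the final model fit
```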
########################## Info ############################
# Toy Example - F1, Precision & Recall averaged over folds
#
########################## Preparation & Libraries ############################
# load libraries
library("dplyr")   # data wrangling
library("ggplot2") # plotting
library("mlbench") # benchmark datasets (not strictly needed for iris)
library("caret")   # hyperparameter tuning
library("tictoc")  # timing the training run
df1 <- iris %>%
  rename(Class = Species) %>%
  subset(Class %in% c("versicolor", "setosa"))
df1$Class <- factor(df1$Class)  # drop the unused "virginica" level
########################## Caret Preparation ############################
k.folds <- 10
# 50 resamples (10 folds x 5 repeats) + 1 seed for the final model;
# note that rep() reuses the same 4 seeds for every resample here
df1.seeds <- c(rep(list(sample(1:10000, 4, replace = TRUE)), 50),
               sample(1:10000, 1, replace = TRUE))
df1.control <- trainControl(   # 10-fold cross-validation, repeated 5 times
  method = "repeatedcv",
  number = k.folds,
  repeats = 5,
  classProbs = TRUE,
  seeds = df1.seeds,
  summaryFunction = prSummary  # alternatives: defaultSummary, twoClassSummary
  # savePredictions = TRUE
)
########################## Hyperparameter Tuning NeuralNet (MLP) ############################
df1.tunegrid <- expand.grid(.size = c(1:(ncol(df1) - 1)))
# metric must be one the summary function actually returns;
# prSummary yields AUC, Precision, Recall and F (not Accuracy)
metric <- "F"
set.seed(1337)
tic("MLP DF1, Hyperparameter Strategy 1: Grid Search")
mlp_df1 <- train(Class ~ ., data = df1, method = "mlp", metric = metric,
                 tuneGrid = df1.tunegrid, trControl = df1.control)
toc()
print(mlp_df1)
# plot(mlp_df1)
print(mlp_df1$bestTune)