I've spent a lot of time trying to understand how the train function works, and I still don't think I understand it. I am training a neural network with the train function to predict the times table. When I plot the model after training, I get the network below:
In the plot it says Error = 0.01643, and I just realized that whenever I train and plot my finalModel, the Error value in the plot always happens to be the Error value from the last output message I get after the train function returns.
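For reference, this is roughly how I read that error off the final model (a minimal sketch; plot.nn and the result.matrix slot come from the neuralnet package):
# Plot the final neuralnet object and read the error it reports
plot(tt.cv$finalModel, rep = 'best')
tt.cv$finalModel$result.matrix['error', ]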
So I was wondering whether the last model was the one chosen, because the last model didn't have the lowest RMSE. I am assuming that those output messages are in order, in other words that the first output, hidden: 8 thresh: 0.01 rep: 1/1 steps: 10304 error: 0.01592 time: 1.75 secs, is from Fold01.Rep1 and the last is from FoldK.RepN, in my case Fold10.Rep3.
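To check which fold/repeat labels were actually used (the ordering I'm assuming above), I believe the saved predictions can be listed like this, since savePredictions = TRUE keeps a Resample column:
# List the resample labels caret used, e.g. Fold01.Rep1 ... Fold10.Rep3
sort(unique(tt.cv$pred$Resample))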
I thought I could cross-check this by re-calculating the RMSEs (since this is the deciding metric for the finalModel) for each hold-out and then comparing the results with tt.cv$resample$RMSE. I would then expect to see a row where my re-calculated RMSE and tt.cv$resample$RMSE are equal, which would mean that that particular model was chosen as my finalModel.
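By RMSE I mean the usual root mean squared error; here is a minimal sketch of what I compute per hold-out (it should be equivalent to caret's RMSE function):
# Root mean squared error between predicted and observed values
rmse <- function(pred, obs) sqrt(mean((pred - obs)^2))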
However, when I re-calculated all the RMSEs for each hold-out, I didn't see any row where my calculation and tt.cv$resample$RMSE were equal. Below you can see the comparison:
The RMSE column has the actual RMSE values that I got from tt.cv$resample$RMSE, and the rmse_ho column has the RMSE values that I re-calculated using the finalModel.
Could you please point out whether I've made any mistakes, and how I can cross-check which model was elected?
While writing this post I realized that there are actually 31 models if I count the outputs. I'd read about this somewhere on the internet but couldn't find it again now.
Here is the code:
library(caret)
library(neuralnet)
# Create the dataset
tt <- data.frame(multiplier = rep(1:10, times = 10), multiplicand = rep(1:10, each = 10))
tt <- cbind(tt, data.frame(product = tt$multiplier * tt$multiplicand))
# Splitting
indexes <- createDataPartition(tt$product,
                               times = 1,
                               p = 0.7,
                               list = FALSE)
tt.train <- tt[indexes,]
tt.test <- tt[-indexes,]
# Pre-process
preProc <- preProcess(tt, method = c('center', 'scale'))
tt.preProcessed <- predict(preProc, tt)
tt.preProcessed.train <- tt.preProcessed[indexes,]
tt.preProcessed.test <- tt.preProcessed[-indexes,]
# Train
train.control <- trainControl(method = "repeatedcv",
number = 10,
repeats = 3,
savePredictions = TRUE)
tune.grid <- expand.grid(layer1 = 8,
layer2 = 0,
layer3 = 0)
# Setting seed for reproducibility & train
set.seed(12)
tt.cv <- train(product ~ .,
               data = tt.preProcessed.train,
               method = 'neuralnet',
               tuneGrid = tune.grid,
               trControl = train.control,
               algorithm = 'backprop',
               learningrate = 0.005,
               stepmax = 100000,
               lifesign = 'minimal',
               threshold = 0.01)
errors <- data.frame(rmse_ho = numeric(), resample = character())

# Re-calculate RMSE values for each hold-out using tt.cv$finalModel
for (i in levels(as.factor(tt.cv$pred$Resample))) {
  dframe <- tt.cv$pred[tt.cv$pred$Resample == as.character(i), ]
  hold_outs <- tt.preProcessed.train[dframe$rowIndex, ]
  # predict() on the train object dispatches to caret's predict method,
  # which uses the final model
  prediction_hold_outs <- predict(tt.cv, hold_outs)
  rmse_hold_out <- RMSE(prediction_hold_outs, hold_outs$product)
  errors <- rbind(errors, data.frame(rmse_ho = rmse_hold_out,
                                     # sse_all = sse_all,
                                     # sse_train = sse_train_data,
                                     resample = i))
}
# View the comparison
View(cbind(tt.cv$resample[order(tt.cv$resample$Resample), ], errors))