I'm trying to understand how GridSearchCV's logic works. I looked here, at the official documentation, and at the source code, but I couldn't figure out the following:
What is the general logic behind GridSearchCV?
Clarifications:
- If I use the default cv=5, what are the percentage splits of the input data into train, validation, and test sets?
- How many times does GridSearchCV perform such a split, and how does it decide which observations belong to train / validation / test?
- Since cross-validation is being done, where does averaging come into play in the hyperparameter tuning? That is, is the optimal hyperparameter value the one that optimizes some sort of average? (A usage sketch follows below.)
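For concreteness, here is roughly how I am calling it. The dataset, estimator, and parameter grid are just placeholders, and the comments mark the points I am unsure about:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.svm import SVC

# Toy data and estimator, only to make the question concrete.
X, y = make_classification(n_samples=1000, random_state=0)

# My assumption: the held-out test set is something I create myself
# (train_test_split defaults to test_size=0.25); GridSearchCV never sees it.
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

param_grid = {"C": [0.1, 1, 10]}
search = GridSearchCV(SVC(), param_grid, cv=5)  # cv=5: 5-fold CV on X_train only?
search.fit(X_train, y_train)

print(search.best_params_)           # chosen by averaging a score over the 5 folds?
print(search.score(X_test, y_test))  # final check on the untouched 25%
```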
This question here shares my concern, but I don't know how up-to-date the information is, and I'm not sure I understand everything there. Going by the OP's description, my understanding is that:
- The test set is 25% of the input data set and is created once.
- The union of the train and validation sets is correspondingly created once, and this union is 75% of the original data.
- The procedure then creates 5 further splits (because cv=5) of this 75% into 60% train and 15% validation, as percentages of the original data.
- The optimal hyperparameter value is the one that optimizes the average of some metric over these 5 splits.
Is this understanding correct and still applicable today? And how does the procedure perform the original 75% / 25% split?
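To make my mental model explicit, the sketch below is the manual procedure I believe GridSearchCV (combined with a test split I make myself) is equivalent to. The estimator, the parameter values, and the use of plain KFold are assumptions on my part:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import KFold, train_test_split
from sklearn.svm import SVC

X, y = make_classification(n_samples=1000, random_state=0)

# Step 1 (my assumption): a one-off 75% / 25% split that I perform myself.
X_trainval, X_test, y_trainval, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Step 2: for each candidate value, 5-fold CV on the 75% block, so each fold
# validates on 15% of the original data and trains on the remaining 60%.
# (I am using plain KFold here; GridSearchCV may stratify for classifiers.)
mean_scores = {}
for C in [0.1, 1, 10]:
    fold_scores = []
    for train_idx, val_idx in KFold(n_splits=5).split(X_trainval):
        model = SVC(C=C).fit(X_trainval[train_idx], y_trainval[train_idx])
        fold_scores.append(model.score(X_trainval[val_idx], y_trainval[val_idx]))
    # Step 3: the quantity I assume is being optimized, i.e. the average over folds.
    mean_scores[C] = np.mean(fold_scores)

best_C = max(mean_scores, key=mean_scores.get)
print(mean_scores, best_C)
```

If this matches what GridSearchCV actually does internally (apart from step 1, which I know I have to do myself), that would answer most of my question.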