
I need to determine the correct seed setting for repeated cross-validation ("repeatedcv") of a KNN model using the caret package in R.

My training dataset has 12 columns and 1000 rows (column 1 is the binary response and the other 11 columns are standardized predictor variables).

How can I correctly set the seeds for "repeatedcv" with 50 folds and 5 repeats?

Is the below seed-setting correct?

Can somebody help me understand the correct seed setting for repeated CV and LOOCV?

Please see my code below.

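set.seed(123)
# 50 folds x 5 repeats = 250 resamples, plus one entry for the final model
seeds <- vector(mode = "list", length = 251)
# 11 integers for each of the 250 resamples
for (i in 1:250) seeds[[i]] <- sample.int(1000, 11)

## For the last model:
seeds[[251]] <- sample.int(1000, 1)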
user3408139

1 Answer


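The 11 in sample.int() should be the number of tuning-parameter values being evaluated, not the number of predictors.
In this case, if you want to evaluate 11 values of K for KNN in each model, then you choose 11. In detail: in one repeat of 10-fold CV, for example, there are 10 models to average over, and in each of those 10 models train() tries 11 values of K. With 50 folds and 5 repeats there are 50 x 5 = 250 such models, so the list needs 250 vectors of 11 seeds plus one single seed for the final model, which is what your code builds.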
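As a rough sketch of how the pieces fit together, the seed list can be built from the fold, repeat, and tuning-value counts and then passed to trainControl() through its seeds argument. The variable names (n_folds, n_repeats, n_tune) and the data names in the commented train() call (response, train_data) are made up for illustration, and 11 candidate values of K is an assumption that has to match whatever tuneLength or tuneGrid you actually use.

```r
# Assumed setup: 50-fold CV, 5 repeats, 11 candidate values of K.
n_folds     <- 50
n_repeats   <- 5
n_tune      <- 11                      # must match tuneLength / nrow(tuneGrid)
n_resamples <- n_folds * n_repeats     # 250 resamples in total

set.seed(123)
seeds <- vector(mode = "list", length = n_resamples + 1)

# One integer vector per resample, one seed per candidate value of K.
for (i in seq_len(n_resamples)) {
  seeds[[i]] <- sample.int(1000, n_tune)
}

# Single seed for the final model fit on the full training set.
seeds[[n_resamples + 1]] <- sample.int(1000, 1)

# Only build the control object if caret is installed.
if (requireNamespace("caret", quietly = TRUE)) {
  ctrl <- caret::trainControl(
    method  = "repeatedcv",
    number  = n_folds,
    repeats = n_repeats,
    seeds   = seeds
  )
  # Hypothetical call; `response` and `train_data` are placeholder names:
  # fit <- caret::train(response ~ ., data = train_data, method = "knn",
  #                     tuneLength = n_tune, trControl = ctrl)
}
```

For LOOCV the same pattern should apply with the number of resamples equal to the number of rows (1000 here, so a list of length 1001), but that is my reading of how caret counts resamples, so check it against the trainControl() documentation.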
Two similar questions already have great answers:
Set seed parallel random forest in caret
Fully reproducible parallel models using caret

Guannan Shen