I'm following a kernel on Kaggle and came across this code.
import numpy as np
from sklearn.model_selection import KFold, cross_val_score

# Validation function
n_folds = 5
def rmsle_cv(model):
    kf = KFold(n_folds, shuffle=True, random_state=42).get_n_splits(train.values)
    rmse= np.sqrt(-cross_val_score(model, train.values, y_train, scoring="neg_mean_squared_error", cv = kf))
    return rmse
I understand the purpose and use of KFold and the fact that it is used in 'cross_val_score'. What I don't get is why 'get_n_splits' is used. As far as I am aware, it returns the number of iterations used for cross-validation, i.e. it returns a value of 5 in this case. Surely for this line:
rmse= np.sqrt(-cross_val_score(model, train.values, y_train, scoring="neg_mean_squared_error", cv = kf))
cv = 5? This doesn't make any sense to me. Why is it even necessary to use get_n_splits if it returns an integer? I thought KFold returns a class instance, whereas get_n_splits returns an integer.
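To illustrate what I mean, here is a minimal check. It is only a sketch: the array `X` is a made-up stand-in for `train.values`, and the names `kf_object` and `n` are mine, not from the kernel.

```python
import numpy as np
from sklearn.model_selection import KFold

# Hypothetical stand-in for train.values: 10 samples, 2 features
X = np.arange(20).reshape(10, 2)

# The splitter object itself
kf_object = KFold(n_splits=5, shuffle=True, random_state=42)

# What the kernel actually assigns to kf: the result of get_n_splits
n = kf_object.get_n_splits(X)

print(type(kf_object).__name__)  # KFold
print(type(n).__name__, n)       # int 5
```

So, as far as I can tell, `kf` in the kernel ends up being the plain integer 5 rather than the KFold object.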
Can anyone clear up my understanding?