I am training several models for a regression problem. To choose the best one, I want to run a k-fold cross-validation with k = 20, characterize the MSE of each model, and determine statistically which model performs best. The problem has multiple dependent variables, and I would like to compute the MSE separately for each of them, but cross_val_score doesn't let me do that explicitly. Here is some example code for one of my models:
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import KFold, cross_val_score
from sklearn.preprocessing import MinMaxScaler

# x, y, x_test, y_test and scaler2 (the scaler applied to the targets) are defined elsewhere
model = LinearRegression()
model.fit(x, y)
y_pred = model.predict(x_test)

# One MSE per dependent variable, computed on the original (unscaled) scale
mse = mean_squared_error(
    scaler2.inverse_transform(y_test),
    scaler2.inverse_transform(y_pred),
    multioutput="raw_values",
)
How can I repeat the training k times, once for each of the k models trained and tested in a k-fold cross-validation? Scikit-learn provides KFold, but as far as I can tell it is just a way to specify the number of folds and doesn't directly return the training and test folds, so I can't think of a way to actually train separate models following the k-fold cross-validation scheme. In addition, I need to evaluate the MSE separately for each dependent variable, since this is a multi-output regression problem.
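To make the goal concrete, here is a minimal sketch of the kind of loop being asked about. It assumes that KFold.split can be used to obtain the training and test indices for each fold. The synthetic data from make_regression and the names X, Y and fold_mse are placeholders for illustration only, not part of the original setup, and the target scaling step is left out for brevity.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import KFold

# Synthetic stand-in data (assumption): 200 samples, 5 features, 2 targets.
X, Y = make_regression(
    n_samples=200, n_features=5, n_targets=2, noise=10.0, random_state=0
)

# KFold.split yields one (train_indices, test_indices) pair per fold.
kf = KFold(n_splits=20, shuffle=True, random_state=0)

fold_mse = []
for train_idx, test_idx in kf.split(X):
    # A fresh model is trained on each training fold.
    model = LinearRegression()
    model.fit(X[train_idx], Y[train_idx])
    Y_pred = model.predict(X[test_idx])
    # multioutput="raw_values" returns one MSE per dependent variable.
    fold_mse.append(
        mean_squared_error(Y[test_idx], Y_pred, multioutput="raw_values")
    )

fold_mse = np.array(fold_mse)  # shape: (n_folds, n_targets)
print("mean MSE per target:", fold_mse.mean(axis=0))
print("std of MSE per target:", fold_mse.std(axis=0))
```

Each row of fold_mse holds the per-target MSE for one fold, so the columns can be compared across models. If the targets are scaled, the scaler would presumably be fitted on the training fold only and inverse-transformed before computing the MSE, as in the single-split code above.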