Basic cross validation:
from sklearn.model_selection import cross_val_score
from sklearn import datasets, svm
X, y = datasets.load_iris(return_X_y=True)
clf = svm.SVC(kernel='linear', C=1)
scores = cross_val_score(clf, X, y, cv=5)
Suppose there is another dataset, X2 and y2, which I would like to concatenate with X and y, but which should not take part in the cross-validation splits. (In all 5 folds, X2 and y2 should be part of the training set.)
Is it still possible to use cross_val_score from scikit-learn to do this?
In other words, is partial cross-validation possible with cross_val_score, where a part of the data always remains in the training set?
P.S.: X2 and y2 are actually synthesized complementary data, and I would like to know whether their presence helps the model perform better. So, for a fair comparison, they shouldn't be part of the test set.
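
To make the intent concrete, here is a minimal sketch of the kind of split I have in mind. It relies on cross_val_score accepting an iterable of (train, test) index pairs for cv. The helper name splits_with_fixed_training is hypothetical, and the X2 and y2 generated here are only placeholders so the snippet runs; they are not my real synthesized data. I am not sure this is the recommended way to do it.

```python
import numpy as np
from sklearn import datasets, svm
from sklearn.model_selection import StratifiedKFold, cross_val_score

X, y = datasets.load_iris(return_X_y=True)

# Placeholder for the synthesized data (assumption, only to make this runnable):
# 30 real samples with a little Gaussian noise added.
rng = np.random.default_rng(0)
picked = rng.choice(len(X), size=30, replace=False)
X2 = X[picked] + rng.normal(scale=0.1, size=(30, X.shape[1]))
y2 = y[picked]

# Stack the real data first, so indices 0..len(X)-1 still refer to real samples.
X_all = np.vstack([X, X2])
y_all = np.concatenate([y, y2])
extra_idx = np.arange(len(X), len(X_all))  # positions of X2 inside X_all


def splits_with_fixed_training(X_real, y_real, extra_idx, n_splits=5):
    """Yield (train, test) index pairs; extra_idx is added to every training set."""
    skf = StratifiedKFold(n_splits=n_splits)
    for train_idx, test_idx in skf.split(X_real, y_real):
        # Test folds contain real samples only; X2 never appears in them.
        yield np.concatenate([train_idx, extra_idx]), test_idx


clf = svm.SVC(kernel='linear', C=1)
cv = list(splits_with_fixed_training(X, y, extra_idx))

# Scores with X2 always in training, and a baseline on the real data alone.
scores_with_extra = cross_val_score(clf, X_all, y_all, cv=cv)
scores_baseline = cross_val_score(clf, X, y, cv=5)
```

Since cv=5 with a classifier uses unshuffled stratified folds, both runs should be evaluated on the same test folds, which is what I mean by a fair comparison.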