I previously saw a post with code like this:
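from sklearn import svm
from sklearn.model_selection import KFold, cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
# X (features) and y (labels) are assumed to be defined already
scaler = StandardScaler()
clf = svm.LinearSVC()
pipeline = Pipeline([('transformer', scaler), ('estimator', clf)])
cv = KFold(n_splits=4)
scores = cross_val_score(pipeline, X, y, cv=cv)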
My understanding is that when we apply the scaler, we should compute the mean and standard deviation from 3 of the 4 folds (the training folds), and then use that mean and standard deviation to transform all 4 folds, including the held-out one.
In the above code, how can I tell whether sklearn follows this strategy? If it does not, and instead calculates the mean and standard deviation from all 4 folds, does that mean I should not use the above code?
I do like the above code because it saves a lot of time.
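One way to check the strategy described above is to inspect the scalers that cross-validation actually fitted. The following is only a minimal sketch under stated assumptions: the data comes from `make_classification` as a stand-in for the real `X` and `y`, and `random_state=0` and `max_iter=10000` are added only to keep the run repeatable. It uses `cross_validate` with `return_estimator=True` to get the fitted pipeline for each split, compares each scaler's `mean_` against the mean of that split's training rows, and also repeats the cross-validation by hand with a scaler fitted on the training folds only.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import KFold, cross_validate
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC

# Synthetic stand-in for the real X, y (an assumption for this sketch)
X, y = make_classification(n_samples=200, n_features=10, random_state=0)

cv = KFold(n_splits=4)
pipeline = Pipeline([
    ("transformer", StandardScaler()),
    ("estimator", LinearSVC(random_state=0, max_iter=10000)),
])

# return_estimator=True keeps the pipeline fitted on each split
result = cross_validate(pipeline, X, y, cv=cv, return_estimator=True)
pipeline_scores = result["test_score"]

means_match = []
manual_scores = []
for fitted, (train_idx, test_idx) in zip(result["estimator"], cv.split(X)):
    # Does the fitted scaler's mean equal the mean of the training folds only?
    fitted_mean = fitted.named_steps["transformer"].mean_
    means_match.append(np.allclose(fitted_mean, X[train_idx].mean(axis=0)))

    # Manual version: fit the scaler on the training folds,
    # then apply it to both the training and the held-out fold
    scaler = StandardScaler().fit(X[train_idx])
    clf = LinearSVC(random_state=0, max_iter=10000)
    clf.fit(scaler.transform(X[train_idx]), y[train_idx])
    manual_scores.append(clf.score(scaler.transform(X[test_idx]), y[test_idx]))

manual_scores = np.array(manual_scores)

print("scaler means match training folds:", means_match)
print("pipeline scores:", pipeline_scores)
print("manual scores:  ", manual_scores)
```

If every entry of `means_match` is `True` and the two score arrays agree, the pipeline was refitted on the training folds of each split, so the held-out fold did not contribute to the mean and standard deviation.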