I am using a LeaveOneGroupOut CV strategy with TPOTRegressor:

    from tpot import TPOTRegressor
    from sklearn.model_selection import LeaveOneGroupOut

    tpot = TPOTRegressor(
        config_dict=regressor_config_dict,
        generations=100,
        population_size=100,
        cv=LeaveOneGroupOut(),
        verbosity=2,
        n_jobs=1,
    )
    tpot.fit(XX, yy, groups=groups)

After optimization, the best-scoring trained pipeline is stored in tpot.fitted_pipeline_, and tpot.fitted_pipeline_.predict(X) is available.
My question is: what will the fitted pipeline have been trained on? For example:
- Does TPOT refit the optimised pipeline on the entire dataset before storing it in tpot.fitted_pipeline_?
- Or will this represent a trained pipeline from the best-scoring split during cross-validation?
Additionally, is there a way to access the complete set of trained models corresponding to the set of splits for the winning/optimized pipeline?
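
To illustrate what I mean by the per-split models, here is a minimal sketch using plain scikit-learn rather than anything TPOT-specific. It assumes the winning pipeline is an ordinary scikit-learn estimator that can be cloned and refit once per LeaveOneGroupOut split. The synthetic data and the StandardScaler + Ridge pipeline are hypothetical stand-ins for my real XX, yy, groups and for tpot.fitted_pipeline_.

```python
import numpy as np
from sklearn.base import clone
from sklearn.linear_model import Ridge
from sklearn.model_selection import LeaveOneGroupOut, cross_validate
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Hypothetical stand-in data: 12 samples, 3 features, 4 groups of 3 samples.
rng = np.random.default_rng(0)
XX = rng.normal(size=(12, 3))
yy = XX @ np.array([1.0, -2.0, 0.5]) + rng.normal(scale=0.1, size=12)
groups = np.repeat([0, 1, 2, 3], 3)

# Stand-in for tpot.fitted_pipeline_ (assumption: it behaves like any
# scikit-learn Pipeline, so it can be cloned and refit).
winning_pipeline = make_pipeline(StandardScaler(), Ridge())

# Refit an unfitted clone on the training part of every split and keep
# each fitted copy via return_estimator=True.
results = cross_validate(
    clone(winning_pipeline),
    XX,
    yy,
    groups=groups,
    cv=LeaveOneGroupOut(),
    scoring="neg_mean_squared_error",
    return_estimator=True,
)

per_split_models = results["estimator"]   # one fitted pipeline per held-out group
per_split_scores = results["test_score"]  # score on each held-out group
```

This re-runs the cross-validation after the search instead of reusing whatever TPOT fitted internally, so I would still like to know whether TPOT exposes those models directly.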