I am trying different ML models, all using a pipeline which includes a transformer and an algorithm, 'nested' in a GridSearchCV to find the best hyperparameters.
When running Ridge, Lasso and ElasticNet regressions, I would like to store all the computed coefficients, not only the best_estimator_ coefficients, in order to plot them along the alpha path.
In other words, whenever GridSearchCV changes the alpha parameter and fits a new model, I would like to store the resulting coefficients so that I can plot them against the alpha values.
You can take a look at this official scikit-learn example for a beautiful illustration.
This is my code:
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import GridSearchCV
from sklearn.metrics import mean_absolute_error, mean_squared_error
import time
start = time.time()
# Cross-validated - Ridge Regression
model_ridge = make_pipeline(transformer, Ridge()) # my transformer is already defined
alphas = np.logspace(-5, 5, num = 50)
params = {'ridge__alpha' : alphas}
grid = GridSearchCV(model_ridge, param_grid = params, cv=10)
grid.fit(X_train, y_train)
regressor = grid.estimator.named_steps['ridge'].coef_ # when I add this line, it returns an error
stop = time.time()
training_time = stop-start
y_pred = grid.predict(X_test)
Ridge_Regression_results = {'Algorithm' : 'Ridge Regression',
                            'R²' : grid.score(X_train, y_train),
                            'MAE' : mean_absolute_error(y_test, y_pred),
                            'RMSE' : np.sqrt(mean_squared_error(y_test, y_pred)),
                            'Training time (sec)' : training_time}
In this topic: return coefficients from Pipeline object in sklearn, the author was advised to use the named_steps attribute of the pipeline.
But in my case, when I try to use it, it returns the following error:
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
~\AppData\Local\Temp/ipykernel_18260/3310195105.py in <module>
13
14 grid.fit(X_train, y_train)
---> 15 regressor = grid.estimator.named_steps['ridge'].coef_
16
17
AttributeError: 'Ridge' object has no attribute 'coef_'
I don't understand why this is happening.
For this to work, my guess is that this storing should happen during the GridSearchCV loop, but I can't figure out how to do this.
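To make the goal concrete, here is a minimal sketch of the kind of result I am after. It does not hook into the GridSearchCV loop itself. It simply refits a clone of the pipeline once per alpha on the full training set and collects the coefficients. The synthetic data from make_regression and the StandardScaler are assumptions standing in for my own X_train, y_train and transformer, and coef_path is a name I made up for illustration. As far as I can tell, grid.estimator is only the unfitted template that GridSearchCV clones, which would explain the missing coef_, while the refitted pipeline lives in grid.best_estimator_.

```python
import numpy as np
from sklearn.base import clone
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Assumption: synthetic data and StandardScaler stand in for my real
# X_train, y_train and transformer.
X_train, y_train = make_regression(n_samples=100, n_features=5,
                                   noise=1.0, random_state=0)
pipe = make_pipeline(StandardScaler(), Ridge())
alphas = np.logspace(-5, 5, num=10)

grid = GridSearchCV(pipe, param_grid={'ridge__alpha': alphas}, cv=5)
grid.fit(X_train, y_train)

# The fitted pipeline is best_estimator_, not estimator (the unfitted template).
best_coef = grid.best_estimator_.named_steps['ridge'].coef_

# One extra fit per alpha on the full training set; each row of coef_path
# holds the Ridge coefficients obtained for the corresponding alpha.
coef_path = np.array([
    clone(pipe).set_params(ridge__alpha=a).fit(X_train, y_train)
               .named_steps['ridge'].coef_
    for a in alphas
])
```

Plotting coef_path against alphas on a logarithmic x-axis would give the coefficient path, at the cost of one additional fit per alpha outside the cross-validation.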