I have already checked this question but the answers didn't help.

I am trying to use a preprocessing step, such as StandardScaler or Normalizer, together with a Perceptron inside GridSearchCV:

from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler, Normalizer
from sklearn.linear_model import Perceptron
from sklearn.model_selection import GridSearchCV

param_grid = [{
    'tol': [1e-1, 1e-3, 1e-5],
    'penalty': ['l2', 'l1', 'elasticnet'],
    'eta0': [0.0001, 0.001, 0.01, 0.1, 1.0]
}]

scoring = {
    'AUC-ROC': 'roc_auc',
    'Accuracy': 'accuracy',
    'AUC-PR': 'average_precision'
}

pipe = Pipeline([('scale', StandardScaler()), ('clf', Perceptron())])

search = GridSearchCV(pipe,
                      param_grid,
                      scoring=scoring,
                      refit='AUC-ROC',
                      cv=skf,  # skf is defined elsewhere (not shown in the question)
                      return_train_score=True)

results = search.fit(Xtrain, ytrain)

When I run the code I get:

ValueError: Invalid parameter class_weight for estimator Pipeline(steps=[('scale', StandardScaler()), ('clf', Perceptron())]). Check the list of available parameters with `estimator.get_params().keys()`.

I think the error is raised because the parameters in param_grid do not apply to StandardScaler(). In addition, printing search.get_params().keys() gives:

dict_keys(['cv', 'error_score', 'estimator__memory', 'estimator__steps', 'estimator__verbose', 'estimator__scale', 'estimator__clf', 'estimator__scale__copy', 'estimator__scale__with_mean', 'estimator__scale__with_std', 'estimator__clf__alpha', 'estimator__clf__class_weight', 'estimator__clf__early_stopping', 'estimator__clf__eta0', 'estimator__clf__fit_intercept', 'estimator__clf__l1_ratio', 'estimator__clf__max_iter', 'estimator__clf__n_iter_no_change', 'estimator__clf__n_jobs', 'estimator__clf__penalty', 'estimator__clf__random_state', 'estimator__clf__shuffle', 'estimator__clf__tol', 'estimator__clf__validation_fraction', 'estimator__clf__verbose', 'estimator__clf__warm_start', 'estimator', 'n_jobs', 'param_grid', 'pre_dispatch', 'refit', 'return_train_score', 'scoring', 'verbose'])

How do I fix it?

plpm
  • Does this answer your question? [How to apply StandardScaler in Pipeline in scikit-learn (sklearn)?](https://stackoverflow.com/questions/51459406/how-to-apply-standardscaler-in-pipeline-in-scikit-learn-sklearn) – Jesus Sono Feb 17 '21 at 20:02
  • @JesusSono It doesn't; however, it provides good information, thanks. – plpm Feb 18 '21 at 06:07

1 Answer

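You need to tell GridSearchCV which step of the pipeline each param_grid entry applies to. Prefix every parameter name with the step name and a double underscore (here clf__, since the Perceptron step is named 'clf'):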

param_grid = [{
    'clf__tol': [1e-1, 1e-3, 1e-5],
    'clf__penalty': ['l2', 'l1', 'elasticnet'],
    'clf__eta0': [0.0001, 0.001, 0.01, 0.1, 1.0]
}]
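
For reference, here is a minimal end-to-end sketch of the same setup with the prefixed grid. It makes several assumptions that are not in the question: the data is synthetic (make_classification stands in for Xtrain and ytrain), skf is taken to be a StratifiedKFold with 3 splits, the grid is shortened to keep the run fast, and random_state is fixed for repeatability. It also checks the grid keys against pipe.get_params(), because the grid is applied to the estimator passed to GridSearchCV; that is why the keys printed from search.get_params() carry an extra estimator__ prefix that must not appear in param_grid.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import Perceptron
from sklearn.model_selection import GridSearchCV, StratifiedKFold
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Assumption: synthetic binary data standing in for Xtrain / ytrain.
X, y = make_classification(n_samples=200, n_features=10, random_state=0)

pipe = Pipeline([('scale', StandardScaler()),
                 ('clf', Perceptron(random_state=0))])

# Every key has the form '<step name>__<parameter name>'.
# Shortened grid (assumption) so the example runs quickly.
param_grid = [{
    'clf__tol': [1e-1, 1e-3],
    'clf__penalty': ['l2', 'l1', 'elasticnet'],
    'clf__eta0': [0.001, 0.1, 1.0],
}]

# Sanity check: grid keys must be parameters of the pipeline itself,
# not of the GridSearchCV object wrapping it.
valid_keys = pipe.get_params().keys()
unknown_keys = [key for key in param_grid[0] if key not in valid_keys]

scoring = {
    'AUC-ROC': 'roc_auc',
    'Accuracy': 'accuracy',
    'AUC-PR': 'average_precision',
}

# Assumption: the question's skf is a StratifiedKFold splitter.
skf = StratifiedKFold(n_splits=3, shuffle=True, random_state=0)

search = GridSearchCV(pipe,
                      param_grid,
                      scoring=scoring,
                      refit='AUC-ROC',
                      cv=skf,
                      return_train_score=True)
search.fit(X, y)

best_params = search.best_params_
print(unknown_keys)
print(best_params)
```

If unknown_keys is not empty, those entries are the ones that would trigger the "Invalid parameter" error, so the check can be run before starting a long search.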
David M.