Is it possible to use an sklearn `Pipeline` with a deep learning model, the same way it is used here with `LogisticRegression`?

clf = Pipeline(
    steps=[("preprocessor", preprocessor), ("classifier", LogisticRegression())]
)

clf.fit(X_train, y_train)

If so, how? My code below produces the error shown after it:

import tensorflow as tf
from sklearn.pipeline import Pipeline

def model():
    ann = tf.keras.models.Sequential()
    ann.add(tf.keras.layers.Dense(units=6, activation='relu'))
    ann.add(tf.keras.layers.Dense(units=6, activation='relu'))
    ann.add(tf.keras.layers.Dense(units=1, activation='sigmoid'))
    ann.compile(optimizer = 'adam', loss = 'binary_crossentropy', metrics = ['accuracy'])
    return ann

clf = Pipeline(
    steps = [
             ('pre', preprocessor),
             ('ann', model())
    ]
)

clf.fit(X_train, y_train, batch_size = 32, epochs = 100)

Error:

ValueError: Pipeline.fit does not accept the batch_size parameter. You can pass parameters to specific steps of your pipeline using the stepname__parameter format, e.g. Pipeline.fit(X, y, logisticregression__sample_weight=sample_weight).

Kyle F Hartzenberg
  • Are you wanting [this](https://stackoverflow.com/questions/42415076/how-to-insert-keras-model-into-scikit-learn-pipeline) or [this_2](https://stackoverflow.com/a/55969300/1740577)? – I'mahdi Jun 11 '22 at 11:25
  • I mean the error says quite explicitly what to do, to use `ann__batch_size` and `ann__epochs` – lejlot Jun 11 '22 at 14:15
  • Yes, you can use an sklearn pipeline for deep learning, but you may need to wrap the model in `KerasClassifier` from `scikeras.wrappers`. See [link1](https://stackoverflow.com/questions/69126555/how-to-log-kerasclassifier-model-in-a-sklearn-pipeline-mlflow) and [link2](https://stackoverflow.com/questions/66860294/keras-network-using-scikit-learn-pipeline-resulting-in-valueerror) for similar issues. –  Nov 21 '22 at 17:01

1 Answer

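To pass a parameter to a specific step of a pipeline, use the form `<step_name>__<parameter_name>` (note the double underscore). Your final step is named `ann`, so pass `ann__batch_size` and `ann__epochs` to `fit` instead of `batch_size` and `epochs`, e.g. `clf.fit(X_train, y_train, ann__batch_size=32, ann__epochs=100)`.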
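
To see the routing in isolation, here is a minimal sketch that runs without TensorFlow. `FitKwargsRecorder` is a hypothetical stand-in I made up for your Keras model (it only records what `fit` receives), and `StandardScaler` plus the random data are placeholders for your `preprocessor`, `X_train` and `y_train`. It assumes scikit-learn's default behaviour, i.e. metadata routing is not enabled.

```python
import numpy as np
from sklearn.base import BaseEstimator, ClassifierMixin
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler


class FitKwargsRecorder(ClassifierMixin, BaseEstimator):
    """Hypothetical stand-in for the Keras model: fit() accepts batch_size and epochs."""

    def fit(self, X, y, batch_size=None, epochs=1):
        # Record what the pipeline forwarded so it can be inspected afterwards.
        self.received_ = {"batch_size": batch_size, "epochs": epochs}
        self.classes_ = np.unique(y)
        return self

    def predict(self, X):
        # Trivial prediction (always the first class); only the routing matters here.
        return np.full(len(X), self.classes_[0])


# Placeholder data standing in for X_train / y_train from the question.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(20, 3))
y_train = rng.integers(0, 2, size=20)

clf = Pipeline(steps=[("pre", StandardScaler()), ("ann", FitKwargsRecorder())])

# Prefix each fit argument with the step name and a double underscore.
clf.fit(X_train, y_train, ann__batch_size=32, ann__epochs=100)

# Shows the batch_size and epochs values that reached the "ann" step.
print(clf.named_steps["ann"].received_)
```

The pipeline strips the `ann__` prefix and hands the remaining keyword arguments to that step's `fit`. With a real Keras model, the commenters' suggestion of wrapping it in `KerasClassifier` from `scikeras.wrappers` is probably the more robust route, since the wrapper is built to behave like a scikit-learn estimator.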

dagelf
  • This issue has been discussed [here](https://adriangb.com/scikeras/stable/migration.html) in detail. – Mario May 09 '23 at 23:33