
I'm using the TPOT classifier, and after training the model I want to save the best pipeline, which I can get using:

model.fitted_pipeline_

This is an example of one of the outputs:

Pipeline(steps=[('extratreesclassifier',
                  ExtraTreesClassifier(criterion='entropy', max_features=0.1,
                                       min_samples_split=8))])

But when I try to pickle this object using joblib.dump I get this error:

pickle.PicklingError: Can't pickle <class 'tpot.operator_utils.ExtraTreesClassifier__bootstrap'>: it's not found as tpot.operator_utils.ExtraTreesClassifier__bootstrap
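For context, pickle stores a class by its module and name and looks it up again under that path, so a class created at runtime and never bound to a module-level name fails in this way. The sketch below reproduces that failure mode with the standard library only. It assumes, without confirmation from TPOT itself, that `ExtraTreesClassifier__bootstrap` is such a dynamically created class; `make_arg_class` and `try_pickle` are hypothetical helpers written for this illustration.

```python
import pickle


def make_arg_class(name):
    # Hypothetical factory: builds a class at runtime with type().
    # The class is returned but never assigned to a module attribute
    # under its own name, so pickle cannot look it up by name later.
    return type(name, (object,), {})


def try_pickle(obj):
    # Returns "ok" if the object can be pickled, otherwise a short
    # description of the PicklingError that was raised.
    try:
        pickle.dumps(obj)
        return "ok"
    except pickle.PicklingError as exc:
        return f"PicklingError: {exc}"


# Same class name as in the traceback, used here purely as a label.
DynamicArg = make_arg_class("ExtraTreesClassifier__bootstrap")

print(try_pickle({"max_features": 0.1}))  # a plain dict pickles fine
print(try_pickle(DynamicArg()))           # the dynamic class does not
```

If this assumption holds, anything that still holds a reference to such a class (or an instance of it) cannot be pickled, while plain scikit-learn objects can.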

So the question is: how can I pickle the trained pipeline? Thanks in advance!

In case it matters: the training happens inside a class and is invoked through a train() method. That method returns the pipeline, and a different method performs the dump. I can't change this structure because of a design constraint.

Rodrigo A

1 Answer


Try using the steps attribute in the fitted pipeline.

Below is the code:

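import joblib
# fitted_pipeline_ (note the trailing underscore) is the trained scikit-learn Pipeline
model = model.fitted_pipeline_.steps[-1][1]  # keeps only the final estimator
joblib.dump(model, "/path/to/pickle")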
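
Note that taking only the last step drops any preprocessing steps a pipeline might contain. As a minimal sketch of the dump/load round trip on the whole pipeline, the example below builds a stand-in for `fitted_pipeline_` directly with scikit-learn, using the hyperparameters shown in the question. The synthetic data, the `random_state` values, and the file name are assumptions made for illustration, and the sketch does not reproduce the TPOT error itself.

```python
import os
import tempfile

import joblib
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.pipeline import Pipeline

# Synthetic data, only so that the pipeline can be fitted.
X, y = make_classification(n_samples=60, n_features=10, random_state=0)

# Stand-in for model.fitted_pipeline_ (same shape as the question's output).
fitted_pipeline = Pipeline(steps=[
    ("extratreesclassifier",
     ExtraTreesClassifier(criterion="entropy", max_features=0.1,
                          min_samples_split=8, random_state=0)),
]).fit(X, y)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "pipeline.joblib")
    joblib.dump(fitted_pipeline, path)   # save the whole pipeline
    restored = joblib.load(path)         # load it back

# The restored pipeline should predict exactly like the original one.
same_predictions = bool((restored.predict(X) == fitted_pipeline.predict(X)).all())
print(same_predictions)
```

If the object returned by train() is a plain scikit-learn Pipeline like this one, the same two joblib calls should work unchanged in the method that performs the dump.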
High-Octane