TPOT: Pickling Error When Using TPOTRegressor

Question

I have a DataFrame called X and a set of target values called Y.

For most of my models, I do something like this (just an example):

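import numpy as np
from sklearn.linear_model import LassoCV
from sklearn.model_selection import KFold, cross_val_score

clf = LassoCV()
# shuffle=True is needed for random_state to have any effect; recent scikit-learn versions raise an error without it
cv = KFold(n_splits=3, shuffle=True, random_state=100)
score = cross_val_score(estimator=clf, X=X, y=Y, cv=cv, n_jobs=-1, scoring="neg_mean_squared_error")
np.mean([np.sqrt(-x) for x in score])  # mean RMSE across the folds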
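
To spell out what that last line computes: the "neg_mean_squared_error" scorer returns the negated MSE for each fold, so negating it and taking the square root gives a per-fold RMSE, and the mean of those is the number I report. A minimal sketch with made-up fold scores (the values are hypothetical, not from my data):

```python
import math

# Hypothetical "neg_mean_squared_error" scores for three folds
score = [-4.0, -9.0, -16.0]

# Negate each score to recover the MSE, then take the square root to get RMSE
rmse_per_fold = [math.sqrt(-s) for s in score]

# Average the per-fold RMSE values
mean_rmse = sum(rmse_per_fold) / len(rmse_per_fold)
print(rmse_per_fold, mean_rmse)
```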

I'm trying to use TPOT in a similar way, as follows:

from tpot import TPOTRegressor

tpot = TPOTRegressor(generations=20, population_size=100, verbosity=2)
# shuffle=True is needed for random_state to have any effect; recent scikit-learn versions raise an error without it
cv = KFold(n_splits=3, shuffle=True, random_state=100)
score = cross_val_score(estimator=tpot, X=X, y=Y, cv=cv, n_jobs=-1, scoring="neg_mean_squared_error")
np.mean([np.sqrt(-x) for x in score])

TPOT starts up but then gives me a pickling error as follows:

PicklingError: Can't pickle <type 'instancemethod'>: it's not found as __builtin__.instancemethod

Any idea why this is happening / how to get TPOT to play nicely?

Thanks!

What about clf = TPOTClassifier(generations=5, population_size=20, cv=5, random_state=42, verbosity=2) instead of using regression, and then calling clf.score(X_test, y_test)? — Mr_U4913, Jun 28 '17 at 22:23
@Mr_U4913 I should be using TPOTRegressor, I believe, since it's a regression problem — anon_swe, Jun 30 '17 at 17:50

score 1 · Answer 1 · answered Jul 10 '17 at 10:31


If you are using Python 2, try:

import dill

That way, lambda functions can be pickled. It worked for me.

In Python 3, you might need:

import dill as pickle
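
To see the kind of failure this works around, here is a minimal sketch that uses only the standard library. The helper name can_pickle is made up for illustration; it is not part of pickle, dill, or TPOT. The standard pickle stores functions by reference to an importable name, and a lambda has no such name:

```python
import math
import pickle


def can_pickle(obj):
    """Return True if the standard-library pickle can serialize obj."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        # Different Python versions raise different exception types here
        return False


# A lambda has no importable name, so pickle cannot store it by reference
square = lambda x: x * x

print("math.sqrt picklable:", can_pickle(math.sqrt))
print("lambda picklable:", can_pickle(square))
```

dill takes a different approach and serializes the function's code itself, which is why it can handle lambdas that the standard pickle rejects.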

ntg


This did not work for me. It still says: PicklingError: Can't pickle : it's not found as tpot.operator_utils.GradientBoostingRegressor__alpha – Dror Hilman Jul 25 '17 at 10:52

score 0 · Answer 2 · answered Jul 25 '17 at 10:58

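Try passing tpot.fitted_pipeline_ to cross_val_score instead of the TPOT object itself. Note that this attribute only exists once the TPOT object has been fitted:

import numpy as np
from sklearn.model_selection import KFold, cross_val_score
from tpot import TPOTRegressor

tpot = TPOTRegressor(generations=20, population_size=100, verbosity=2)
tpot.fit(X, Y)  # fitted_pipeline_ is only set after fit() has run

cv = KFold(n_splits=3, shuffle=True, random_state=100)
score = cross_val_score(estimator=tpot.fitted_pipeline_, X=X, y=Y, cv=cv, n_jobs=-1, scoring="neg_mean_squared_error")
np.mean([np.sqrt(-x) for x in score])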
