4

I have a DataFrame called X and a set of target values called Y.

For most of my models, I do something like this (just an example):

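import numpy as np
from sklearn.linear_model import LassoCV
from sklearn.model_selection import KFold, cross_val_score

clf = LassoCV()
# shuffle = True is needed for random_state to take effect (recent scikit-learn raises an error otherwise)
score = cross_val_score(estimator = clf, X = X, y = Y, cv = KFold(n_splits = 3, shuffle = True, random_state = 100), n_jobs = -1,
                        scoring = "neg_mean_squared_error")
np.mean([np.sqrt(-x) for x in score])  # mean RMSE across the folds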

I'm trying to use TPOT in a similar way, as follows:

from tpot import TPOTRegressor
tpot = TPOTRegressor(generations=20, population_size=100, verbosity=2)

# shuffle = True is needed for random_state to take effect
score = cross_val_score(estimator = tpot, X = X, y = Y, cv = KFold(n_splits = 3, shuffle = True, random_state = 100), n_jobs = -1,
                        scoring = "neg_mean_squared_error")
np.mean([np.sqrt(-x) for x in score])

TPOT starts up but then gives me a pickling error as follows:

PicklingError: Can't pickle <type 'instancemethod'>: it's not found as __builtin__.instancemethod

Any idea why this is happening / how to get TPOT to play nicely?

Thanks!

stop-cran
anon_swe
  • What about using clf = TPOTClassifier(generations=5, population_size=20, cv=5, random_state=42, verbosity=2) instead of regression, and then calling clf.score(X_test, y_test)? – Mr_U4913 Jun 28 '17 at 22:23
  • @Mr_U4913 I should be using TPOTRegressor, I believe, since it's a regression problem – anon_swe Jun 30 '17 at 17:50

2 Answers

1

If you are using Python 2, try:

import dill

This lets lambda functions be pickled. It worked for me.

In Python 3, you might need:

import dill as pickle
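
For context, the error most likely appears because n_jobs = -1 makes cross_val_score send the estimator to worker processes, which requires pickling it. The standard pickle module stores functions by importable name, so anonymous functions cannot be serialized. The sketch below illustrates that limitation using only the standard library; can_pickle and square are illustrative names, not part of TPOT or dill.

```python
import pickle


def can_pickle(obj):
    """Return True if the standard pickle module can serialize obj."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False


# A lambda has no importable name for pickle to record, so it cannot be pickled.
square = lambda x: x * x
```

Here can_pickle(len) returns True while can_pickle(square) returns False. dill extends pickling to objects such as lambdas, which is why importing it may help, although (as the comment below shows) it does not cover every object inside TPOT.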
ntg
  • This did not work for me. It still says: PicklingError: Can't pickle : it's not found as tpot.operator_utils.GradientBoostingRegressor__alpha – Dror Hilman Jul 25 '17 at 10:52
0

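Try cross-validating tpot.fitted_pipeline_ instead of the TPOT object itself. This attribute only exists after tpot.fit has been called:

from tpot import TPOTRegressor
tpot = TPOTRegressor(generations=20, population_size=100, verbosity=2)
tpot.fit(X, Y)  # fitted_pipeline_ is only available after fitting

# shuffle = True is needed for random_state to take effect
score = cross_val_score(estimator = tpot.fitted_pipeline_, X = X, y = Y, cv = KFold(n_splits = 3, shuffle = True, random_state = 100), n_jobs = -1,
                        scoring = "neg_mean_squared_error")
np.mean([np.sqrt(-x) for x in score])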
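
Note that the approach above scores a pipeline TPOT has already selected, rather than repeating the TPOT search inside each fold. If the search itself should be cross-validated, one option is to run the folds sequentially in the current process so that the estimator never has to be pickled. This is a minimal sketch under that assumption; sequential_cv_rmse and make_estimator are hypothetical names, and shuffle=True is added because recent scikit-learn versions require it when random_state is set.

```python
import numpy as np
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import KFold


def sequential_cv_rmse(make_estimator, X, y, n_splits=3, seed=100):
    """Return the per-fold RMSE, fitting one fresh estimator per fold in this process."""
    X = np.asarray(X)
    y = np.asarray(y)
    folds = KFold(n_splits=n_splits, shuffle=True, random_state=seed)
    rmses = []
    for train_idx, test_idx in folds.split(X):
        # Build a new, unfitted estimator so that no state leaks between folds.
        model = make_estimator()
        model.fit(X[train_idx], y[train_idx])
        pred = model.predict(X[test_idx])
        rmses.append(np.sqrt(mean_squared_error(y[test_idx], pred)))
    return np.array(rmses)
```

With TPOT, this could be called as sequential_cv_rmse(lambda: TPOTRegressor(generations=20, population_size=100, verbosity=2), X, Y), followed by .mean() on the result. Each fold then runs a full TPOT search, so expect the total runtime to be roughly n_splits times that of a single fit.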
Dror Hilman