I have the following pipeline, and I want to obtain the random forest model object that gets built inside it. The only object I initialize myself is `rf`, and it has no `rf.estimators_` attribute.

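    from sklearn.ensemble import RandomForestRegressor
    from sklearn.model_selection import GridSearchCV, StratifiedKFold

    # Single-point grid: every hyperparameter has exactly one candidate value
    grid_params = [{'bootstrap': [True],
                    'min_samples_leaf': [5],
                    'n_estimators': [1000],
                    'max_features': [8],
                    'min_samples_split': [12],
                    'max_depth': [60]}]

    rf = RandomForestRegressor()
    grid_search = GridSearchCV(estimator=rf, param_grid=grid_params)

    # PipelineRF is not defined in this snippet (presumably a Pipeline subclass)
    pipe = PipelineRF(
        [
            ('grid', grid_search)
        ]
    )
    # Created but never passed to GridSearchCV; recent scikit-learn versions
    # require shuffle=True whenever random_state is set
    _ = StratifiedKFold(shuffle=True, random_state=125)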

Is there a way to call something like `pipe.get_model()`?

YaOzI
add-semi-colons
  • `pipe.get_params()`? – Peybae Aug 31 '18 at 22:53
  • Use `named_steps`: `pipe.named_steps['grid'].best_estimator_` in your case. See [this](https://stackoverflow.com/questions/43856280/return-coefficients-from-pipeline-object-in-sklearn) – Vivek Kumar Sep 01 '18 at 03:27
  • Your case is just the inverse of the question I linked. There, the pipeline was inside the grid search, so we first accessed `best_estimator_` to get the pipeline and then used `named_steps`. In your case, we first use `named_steps` to get the inner grid-search object and then use `best_estimator_` on that to get the best RF model found by the grid search. – Vivek Kumar Sep 01 '18 at 03:33
  • Anyway, your scenario here (grid search inside a pipeline) is a little unusual. Why are you putting the pipeline above the grid search? Are you doing any other operations on the data in the pipeline before sending it to the grid search? If so, you need to make sure they don't leak test information into the grid-search model, or you will get biased results. – Vivek Kumar Sep 01 '18 at 03:35
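
The access path suggested in the comments can be sketched as follows. This is only an illustration under a few assumptions: a plain scikit-learn `Pipeline` stands in for the question's `PipelineRF`, the data is synthetic (`make_regression`), and the grid is shrunk so it runs quickly. None of that comes from the original post.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline

# Synthetic data (assumption): 120 samples, 8 features
X, y = make_regression(n_samples=120, n_features=8, random_state=0)

# Deliberately small single-point grid so the sketch fits in seconds
grid_params = [{'n_estimators': [20], 'max_depth': [5], 'min_samples_leaf': [5]}]
grid_search = GridSearchCV(estimator=RandomForestRegressor(random_state=0),
                           param_grid=grid_params, cv=3)

# Plain Pipeline used in place of the question's PipelineRF (assumption)
pipe = Pipeline([('grid', grid_search)])
pipe.fit(X, y)

# named_steps gives the fitted GridSearchCV step; best_estimator_ is the
# forest refit on the full training data (available because refit=True by default)
best_rf = pipe.named_steps['grid'].best_estimator_

# The fitted forest, unlike the unfitted rf template, exposes estimators_
n_trees = len(best_rf.estimators_)
print(type(best_rf).__name__, n_trees)
```

Note that the original `rf` object passed to `GridSearchCV` is only a template that gets cloned, which is why it never acquires `estimators_`; the fitted model lives on the grid-search object after `fit`.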

0 Answers