
Based on Recursive feature elimination and grid search using scikit-learn, I know that RFECV can be combined with GridSearchCV to obtain better parameter settings for a model such as a linear SVM.

As the answer there says, there are two ways:

  • "Run GridSearchCV on RFECV, which will result in splitting the data into folds two times (once inside GridSearchCV and once inside RFECV), but the search over the number of components will be efficient."

  • "Do GridSearchCV just on RFE, which would result in a single splitting of the data, but in very inefficient scanning of the parameters of the RFE estimator."
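A minimal sketch of the first option, assuming scikit-learn's `GridSearchCV`, `RFECV`, and `SVC` on a synthetic dataset (the grid values and dataset sizes are illustrative, not from the original question):

```python
# First approach: GridSearchCV wrapped around RFECV.
# The outer search tunes C of the linear SVM; for each candidate C,
# RFECV internally cross-validates over the number of features,
# so the data is split twice (once per level).
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFECV
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

X, y = make_classification(n_samples=100, n_features=10, random_state=0)

rfecv = RFECV(estimator=SVC(kernel="linear"), step=1, cv=3)
# The inner SVM's C is addressed through RFECV's `estimator` parameter.
search = GridSearchCV(rfecv, param_grid={"estimator__C": [0.1, 1, 10]}, cv=3)
search.fit(X, y)

print(search.best_params_)                 # best C found by the outer search
print(search.best_estimator_.n_features_)  # feature count kept by RFECV at that C
```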

To make my question clear, let me first spell out my understanding of RFECV:

  1. Split the whole dataset into n folds.

  2. In every fold, obtain the feature ranking by fitting RFE on the training data only.

  3. Sort the ranking, fit the SVM on the training data, and score it on the test data. This is done m times, each with a decreasing number of features, where m is the number of features (assuming step=1).

  4. The previous step yields a sequence of scores per fold. After steps 1–3 have been run on all n folds, these sequences are averaged across the folds, and the averaged scoring sequence suggests the best number of features for RFE.

  5. Take that best number of features as the n_features_to_select argument of an RFE fitted on the original whole dataset.

  6. Use .support_ to get the "winners" among the features and .grid_scores_ to get the averaged scoring sequence.

  7. Please correct me if I am wrong, thank you.
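The steps above can be sketched with RFECV directly, again on a synthetic dataset (the sizes and cv=5 are illustrative assumptions; note that in recent scikit-learn releases `.grid_scores_` has been replaced by `.cv_results_`, so the sketch sticks to `.support_` and `.n_features_`):

```python
# Minimal sketch of steps 1-6: RFECV with a linear SVM.
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFECV
from sklearn.svm import SVC

X, y = make_classification(n_samples=100, n_features=10, n_informative=3,
                           random_state=0)

# cv=5 splits the data into 5 folds (step 1); in each fold RFE ranks the
# features on the training part, and an SVM is scored for every feature
# count (steps 2-3); the scores are averaged across folds (step 4), and
# RFECV refits on the whole data with the winning count (step 5).
selector = RFECV(estimator=SVC(kernel="linear"), step=1, cv=5)
selector.fit(X, y)

print(selector.support_)     # boolean mask of the "winner" features (step 6)
print(selector.n_features_)  # the best number of features found
```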

So my question is: where does GridSearchCV go? My guess for the second way, "do GridSearchCV just on RFE", is that GridSearchCV is applied at step 5: set the SVM parameter to one value from the grid, fit on the training data split off by GridSearchCV to obtain the number of features suggested in step 4, and score on the held-out data. This is repeated k times, where k is the cv argument of GridSearchCV, and the averaged score measures how good that grid value is. However, the selected features might differ across training splits and grid values, which would make this second way unreasonable if it works the way I guess.
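For comparison, the second approach is usually written with the number of features as just another grid parameter, so the data is split only once by GridSearchCV (the grid values below are illustrative assumptions, not from the original question):

```python
# Second approach: GridSearchCV directly on RFE, searching C and the
# number of features jointly. Each n_features_to_select value refits
# RFE from scratch, which is the "inefficient scanning" mentioned above.
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFE
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

X, y = make_classification(n_samples=100, n_features=10, random_state=0)

rfe = RFE(estimator=SVC(kernel="linear"))
param_grid = {
    "n_features_to_select": [2, 4, 6, 8],
    "estimator__C": [0.1, 1, 10],
}
search = GridSearchCV(rfe, param_grid=param_grid, cv=5)
search.fit(X, y)

print(search.best_params_)
# The final feature mask comes from the refit on the full data:
print(search.best_estimator_.support_)
```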

How is GridSearchCV actually combined with RFECV?

Francis
  • also see http://stackoverflow.com/questions/29538292/doing-hyperparameter-estimation-for-the-estimator-in-each-fold-of-recursive-feat – Andreas Mueller Apr 23 '15 at 01:09
    This is also relevant: https://stats.stackexchange.com/questions/264533/how-should-feature-selection-and-hyperparameter-optimization-be-ordered-in-the-m – mirgee Dec 30 '18 at 12:00

0 Answers