I was trying to do feature selection with RFECV:

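from sklearn.feature_selection import RFECV
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

# X_matrix holds the features and y the target
X_train, X_test, y_train, y_test = train_test_split(
    X_matrix, y, test_size=0.2, random_state=42)

selector = RFECV(DecisionTreeRegressor(), min_features_to_select=5,
                 step=5, cv=2, n_jobs=-1)
selector.fit(X_train, y_train)

# Boolean mask of the selected features
print(selector.support_)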

The printed output is a boolean mask of the selected features, but every execution returns a different mask. The X and y DataFrames themselves are fine.

Outputs:

1

[ True True True True True False False False True True False True True True True True False True True False True True True True True True True True True True False True True True True True True True True True True False False True False True True]

2

[ True True True True True True True True True True True True True True True True True True True True True True True True True True True True True True True True True True True True True True True True True True True True True True True]

3

[False True False False True False False False True True False False False False False True False True False False False False False False False False True False False False False False False False False True True True True True False False False False False False False]

and so on...

What might be the cause?

Mario L
  • `DecisionTreeRegressor` also has a `random_state` parameter, which should be set to get reproducible results. See this: https://stackoverflow.com/a/39158831/3374996 – Vivek Kumar Jan 28 '19 at 10:20
  • @VivekKumar you are right, but I still find it weird: shouldn't the number of selected features be independent of the regressor? – Mario L Jan 28 '19 at 10:26
  • @VivekKumar and @Mario L - How do we set the number of features in RFECV? I don't want to set `min_features_to_select`. I would like to select the top 15 features. How can I do that? – The Great Dec 13 '19 at 14:01
  • @SSMK You set `step` to "total_features - 15" and leave `min_features_to_select` at its default. – Vivek Kumar Dec 13 '19 at 16:52
  • Can you help me with this if you have time? https://datascience.stackexchange.com/questions/64906/how-to-transform-specific-type-feature-to-yield-better-prediction – The Great Dec 16 '19 at 08:17
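
The `random_state` suggestion in the comments can be sketched roughly as follows. This is only an illustration under stated assumptions: the data is synthetic (generated with `make_regression` as a stand-in for `X_matrix` and `y`), the helper `fit_selector` is a hypothetical name, and the seed value is arbitrary. It fits the selector twice with a seeded tree and checks that both runs give the same mask.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.feature_selection import RFECV
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor


def fit_selector(X, y, seed=0):
    """Hypothetical helper: fit RFECV with a seeded tree and return the mask."""
    # A fixed random_state makes the tree's split choices repeatable
    estimator = DecisionTreeRegressor(random_state=seed)
    selector = RFECV(estimator, min_features_to_select=5, step=5, cv=2)
    selector.fit(X, y)
    return selector.support_


# Synthetic stand-in for X_matrix and y (assumption, not the asker's data)
X, y = make_regression(n_samples=200, n_features=47, n_informative=10,
                       noise=0.1, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

mask_a = fit_selector(X_train, y_train)
mask_b = fit_selector(X_train, y_train)

# Both runs use the same seed, so the masks should match
print(np.array_equal(mask_a, mask_b))
```

With an integer `cv` and a regressor, the folds come from an unshuffled `KFold`, so the tree is the only source of randomness here. If a shuffled splitter were passed as `cv` instead, it would need its own `random_state` as well.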
