
I am using scikit-learn GBM predictions for one of my exercises. For understanding feature importance after fitting on the train data, I can easily do it like this in Python, since the 'fit' method has those:

But I would like to know the feature importance on the test dataset too, and the 'predict' method doesn't have anything like this:

import pandas as pd
from sklearn import ensemble

gbm = ensemble.GradientBoostingRegressor(**params)
gbm.fit(X_train, y_train)
# feature importance from the fitted model
feat_imp = pd.DataFrame(gbm.feature_importances_)

Is there any solution that can help me understand the important features on the test or prediction dataset, with sklearn GBM or otherwise?

Thanks for all the help!
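
One possibly relevant route, offered as a minimal sketch rather than a definitive answer: scikit-learn's permutation_importance (available since version 0.22) scores a fitted model against whatever data you pass it, so it can be run on the test set. This assumes gbm has been fit as above and that X_test is a pandas DataFrame:

from sklearn.inspection import permutation_importance

# Shuffle each feature in X_test and measure how much the model's score drops;
# unlike feature_importances_, this is computed on the data you pass in.
result = permutation_importance(gbm, X_test, y_test, n_repeats=10, random_state=0)
feat_imp_test = pd.DataFrame({"importance": result.importances_mean},
                             index=X_test.columns)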

  • First of all, 'fit' is a method and as such doesn't hold the feature importances; the fitted Gradient Boosting object has them. In fact, the feature importance just goes over the weight of every feature in the trees and normalizes them. You are then analysing the **trees**, not the data. You are not changing the weights of the trees when predicting, so the weights won't change and the feature importances won't either. To compare, you can try fitting two different GBMs and comparing the FI between them (see the sketch after these comments). – Frayal Jan 21 '19 at 12:28
  • Possible duplicate of [Using scikit to determine contributions of each feature to a specific class prediction](https://stackoverflow.com/questions/35249760/using-scikit-to-determine-contributions-of-each-feature-to-a-specific-class-pred) – Chris Jan 21 '19 at 13:39
  • @Alexis It is helpful – Manu Sharma Jan 21 '19 at 14:46
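
As a minimal sketch of the comparison Frayal suggests above, assuming params, the train/test splits, and a DataFrame X_train as in the question; fitting a second model on the test split here is only for comparing what the two sets of trees learned:

import pandas as pd
from sklearn import ensemble

# The importances live in the fitted trees, not in predict(), so comparing
# feature importances means comparing two separately fitted models.
gbm_a = ensemble.GradientBoostingRegressor(**params).fit(X_train, y_train)
gbm_b = ensemble.GradientBoostingRegressor(**params).fit(X_test, y_test)

comparison = pd.DataFrame({
    "fit_on_train": gbm_a.feature_importances_,
    "fit_on_test": gbm_b.feature_importances_,
}, index=X_train.columns)
print(comparison)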

0 Answers