I have a random forest classifier that gave me a feature importance ranking.

How can I derive statistical significance of the important features, similar to a regression model where you can infer statistical significance of the betas?

noiivice

1 Answer

Your question is a bit too broad and unclear.

An easy way to view the feature_importances_ values as percentages is to normalize them by their sum:

# clf is assumed to be an already fitted RandomForestClassifier
importance_sum = sum(clf.feature_importances_)
feature_importance_as_percent = [100 * (x / importance_sum) for x in clf.feature_importances_]

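Note that this only rescales the importances; it does not test significance. For that you would need a parametric or non-parametric test, for example a permutation test.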
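As a rough illustration of the non-parametric route, one option is a label-permutation test: refit the forest on shuffled labels many times to build a null distribution of importances, then count how often the null importance reaches the observed one. This is only a sketch and not a built-in scikit-learn feature. The helper name importance_p_values, the synthetic data, and the small number of permutations and trees are all assumptions chosen to keep the example quick.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier


def importance_p_values(X, y, n_permutations=30, n_estimators=50, seed=0):
    """Empirical p-values for impurity-based importances (hypothetical helper).

    The null distribution is built by refitting the forest on shuffled labels,
    which breaks any real association between the features and the target.
    """
    rng = np.random.default_rng(seed)

    # Importances observed on the real labels.
    clf = RandomForestClassifier(n_estimators=n_estimators, random_state=seed)
    observed = clf.fit(X, y).feature_importances_

    # Count, per feature, how often a null importance reaches the observed one.
    exceed = np.zeros_like(observed)
    for i in range(n_permutations):
        y_perm = rng.permutation(y)
        null_clf = RandomForestClassifier(
            n_estimators=n_estimators, random_state=seed + 1 + i
        )
        exceed += null_clf.fit(X, y_perm).feature_importances_ >= observed

    # The +1 correction keeps the empirical p-value strictly above zero.
    p_values = (exceed + 1) / (n_permutations + 1)
    return observed, p_values


# Synthetic data (an assumption for the demo): with shuffle=False the first
# two columns are the informative ones and the remaining four are noise.
X, y = make_classification(
    n_samples=300, n_features=6, n_informative=2, n_redundant=0,
    shuffle=False, random_state=0,
)
observed, p_values = importance_p_values(X, y)
for j, (imp, p) in enumerate(zip(observed, p_values)):
    print(f"feature {j}: importance={imp:.3f}, p-value={p:.3f}")
```

The smallest p-value this can report is 1 / (n_permutations + 1), so in practice you would likely want many more permutations than in this demo, and a multiple-testing correction if you have many features.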

Read also this: How are feature_importances in RandomForestClassifier determined?

seralouk