My task is to understand which features (the columns of dataset X) are best at predicting the target variable y. I've decided to use feature_importances_ from RandomForestClassifier. RandomForestClassifier gets its best score (ROC AUC) with max_depth=10 and n_estimators=50. Is it correct to read feature_importances_ from the model fitted with the best parameters, or from one fitted with the default parameters? Why? How does feature_importances_ work?
Here are the two models, one with the best parameters and one with the default parameters, as an example.
1)
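import pandas as pd
from sklearn.ensemble import RandomForestClassifier

# Model with the parameters that gave the best ROC AUC
model = RandomForestClassifier(max_depth=10, n_estimators=50)
model.fit(X, y)
feature_imp = pd.DataFrame(model.feature_importances_, index=X.columns, columns=["importance"])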
2)
model = RandomForestClassifier()
model.fit(X, y)
feature_imp = pd.DataFrame(model.feature_importances_, index=X.columns, columns=["importance"])
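
To make the comparison concrete, here is a minimal self-contained sketch of what I mean. It puts the importances from both models side by side and ranks the features under each. The data comes from make_classification as a stand-in for my real X and y, and the column names f0..f7 and the random_state values are assumptions added only so the example runs and is reproducible.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in for the real X and y (assumption: not the actual data).
X_arr, y = make_classification(
    n_samples=500, n_features=8, n_informative=3, n_redundant=0, random_state=0
)
X = pd.DataFrame(X_arr, columns=[f"f{i}" for i in range(X_arr.shape[1])])

# 1) model with the tuned parameters, 2) model with the defaults
tuned = RandomForestClassifier(max_depth=10, n_estimators=50, random_state=0).fit(X, y)
default = RandomForestClassifier(random_state=0).fit(X, y)

# Side-by-side importances, one row per feature
comparison = pd.DataFrame(
    {"tuned": tuned.feature_importances_, "default": default.feature_importances_},
    index=X.columns,
)

# Rank 1 = most important feature under each model
comparison["rank_tuned"] = comparison["tuned"].rank(ascending=False)
comparison["rank_default"] = comparison["default"].rank(ascending=False)

print(comparison.sort_values("tuned", ascending=False))
```

In both cases the values in feature_importances_ are non-negative and sum to 1, so what I am really comparing is how the total importance is split across the columns and whether the ranking changes between the two models.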