There is a proposal to implement this in Sklearn (#15075), but in the meantime eli5 is suggested as a solution. However, I'm not sure I'm using it the right way. This is my code:
from sklearn.datasets import make_friedman1
from sklearn.feature_selection import RFECV
from sklearn.svm import SVR
import eli5
X, y = make_friedman1(n_samples=50, n_features=10, random_state=0)
estimator = SVR(kernel="linear")
perm = eli5.sklearn.PermutationImportance(estimator, scoring='r2', n_iter=10, random_state=42, cv=3)
selector = RFECV(perm, step=1, min_features_to_select=1, scoring='r2', cv=3)
selector = selector.fit(X, y)
selector.ranking_
#eli5.show_weights(perm) # fails: AttributeError: 'PermutationImportance' object has no attribute 'feature_importances_'
There are a few issues:
1. I am not sure whether I am using cross-validation the right way. Does PermutationImportance use cv to compute the importances on the validation folds, or should cross-validation be done only by RFECV? (In the example I used cv=3 in both places, but I am not sure that is the right thing to do.)
2. If I uncomment the last line, I get AttributeError: 'PermutationImportance' ... Is this because I fit through RFECV? What I'm doing is similar to the last snippet here: https://eli5.readthedocs.io/en/latest/blackbox/permutation_importance.html
3. As a less important issue, I get this warning when I set cv in eli5.sklearn.PermutationImportance:
.../lib/python3.8/site-packages/sklearn/utils/validation.py:68: FutureWarning: Pass classifier=False as keyword args. From version 0.25 passing these as positional arguments will result in an error
warnings.warn("Pass {} as keyword args. From version 0.25 "
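Regarding the AttributeError, one thing I can check without eli5 is whether RFECV fits the object I pass in or a copy of it. The sketch below uses a plain SVR in place of the PermutationImportance wrapper (that substitution is my assumption for illustration) and looks at which object ends up fitted. If the same holds for the wrapper, perm itself would never be fitted, which might explain the missing feature_importances_.

```python
from sklearn.datasets import make_friedman1
from sklearn.feature_selection import RFECV
from sklearn.svm import SVR

X, y = make_friedman1(n_samples=50, n_features=10, random_state=0)

# Stand-in for the wrapped estimator; a plain SVR keeps the check eli5-free.
estimator = SVR(kernel="linear")
selector = RFECV(estimator, step=1, min_features_to_select=1, scoring="r2", cv=3)
selector.fit(X, y)

# RFECV works on clones, so the object passed in is left unfitted ...
original_is_fitted = hasattr(estimator, "support_vectors_")
# ... while the fitted copy (trained on the selected features only)
# is stored on the selector as estimator_.
clone_is_fitted = hasattr(selector.estimator_, "support_vectors_")

print("original fitted:", original_is_fitted)
print("selector.estimator_ fitted:", clone_is_fitted)
```

So any fitted attributes would have to be read from selector.estimator_ rather than from the object that was passed to RFECV; whether eli5.show_weights accepts that object is something I have not verified.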
The whole process is a bit vague. Is there a way to do it directly in Sklearn, e.g. by adding a feature_importances_ attribute?
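To make the last question concrete, here is a minimal sketch of what I have in mind, using only scikit-learn. The class name PermutationImportanceRegressor is hypothetical (it is not part of Sklearn or eli5); it fits the wrapped regressor and stores the output of sklearn.inspection.permutation_importance as feature_importances_, which RFECV can then read. Two assumptions to flag: the importances are computed on the same data the model was fitted on, and negative importances are clipped to zero because, as far as I understand, RFE squares the importances before ranking them.

```python
import numpy as np
from sklearn.base import BaseEstimator, RegressorMixin, clone
from sklearn.datasets import make_friedman1
from sklearn.feature_selection import RFECV
from sklearn.inspection import permutation_importance
from sklearn.svm import SVR


class PermutationImportanceRegressor(RegressorMixin, BaseEstimator):
    """Hypothetical wrapper exposing permutation importances to RFE/RFECV."""

    def __init__(self, estimator, scoring="r2", n_repeats=5, random_state=None):
        self.estimator = estimator
        self.scoring = scoring
        self.n_repeats = n_repeats
        self.random_state = random_state

    def fit(self, X, y):
        # Fit a clone so the estimator passed to __init__ stays untouched.
        self.estimator_ = clone(self.estimator).fit(X, y)
        # Assumption: importances are measured on the training data itself,
        # which is optimistic; a held-out split would be more faithful.
        result = permutation_importance(
            self.estimator_,
            X,
            y,
            scoring=self.scoring,
            n_repeats=self.n_repeats,
            random_state=self.random_state,
        )
        # Clip negatives to zero so that a harmful feature is not ranked
        # as important once the importances are squared.
        self.feature_importances_ = np.clip(result.importances_mean, 0.0, None)
        return self

    def predict(self, X):
        return self.estimator_.predict(X)


X, y = make_friedman1(n_samples=50, n_features=10, random_state=0)
wrapped = PermutationImportanceRegressor(SVR(kernel="linear"), random_state=42)
selector = RFECV(wrapped, step=1, min_features_to_select=1, scoring="r2", cv=3)
selector.fit(X, y)

print(selector.ranking_)
print(selector.estimator_.feature_importances_)
```

In this sketch cross-validation happens only in RFECV, and the wrapper does no cross-validation of its own. I am not sure whether that is equivalent to what eli5 does when cv is set on PermutationImportance.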