
I am using Python (3.6), Anaconda (64 bit), and Spyder (3.1.2). I have already built a neural network model with Keras (2.0.6) for a regression problem (one response, 10 variables). I was wondering how I can generate a feature importance chart like this:

(image: feature importance chart)

from keras.models import Sequential
from keras.layers import Dense
from keras.wrappers.scikit_learn import KerasRegressor

def base_model():
    # 10 input features -> 200 relu units -> 1 linear output
    model = Sequential()
    model.add(Dense(200, input_dim=10, kernel_initializer='normal', activation='relu'))
    model.add(Dense(1, kernel_initializer='normal'))
    model.compile(loss='mean_squared_error', optimizer='adam')
    return model

clf = KerasRegressor(build_fn=base_model, epochs=100, batch_size=5, verbose=0)
clf.fit(X_train, Y_train)
CDspace, andre

4 Answers


I was recently looking for the answer to this question and found something that was useful for what I was doing, so I thought it would be helpful to share. I ended up using the permutation importance module from the eli5 package. It works most easily with a scikit-learn model, and luckily Keras provides a wrapper for sequential models. As shown in the code below, using it is very straightforward.

from keras.models import Sequential
from keras.wrappers.scikit_learn import KerasClassifier, KerasRegressor
import eli5
from eli5.sklearn import PermutationImportance

def base_model():
    model = Sequential()
    ...
    return model

X = ...
y = ...

my_model = KerasRegressor(build_fn=base_model, **sk_params)
my_model.fit(X, y)

# fit the permutation importance estimator: each feature is shuffled in
# turn and the resulting drop in model score is recorded
perm = PermutationImportance(my_model, random_state=1).fit(X, y)
eli5.show_weights(perm, feature_names=X.columns.tolist())
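
Note that `show_weights` returns an HTML object, so (as the comments below point out) it only renders inside a Jupyter/IPython notebook. Outside a notebook you can read the raw scores directly; a minimal sketch, assuming the `perm` and `X` objects from above:

# plain-text rendering of the same explanation
print(eli5.format_as_text(eli5.explain_weights(perm)))

# or the raw scores: each value is the mean drop in model score when
# that feature is permuted, so the values need not sum to one
for name, imp in zip(X.columns, perm.feature_importances_):
    print(name, imp)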
Akavall, Justin Hallas
  • this line *eli5.show_weights(perm, feature_names = X.columns.tolist())* returns error: *AttributeError: module 'eli5' has no attribute 'show_weights'* – S34N Nov 14 '18 at 08:38
  • Traceback (most recent call last): File in eli5.show_weights(perm, feature_names = col) AttributeError: module 'eli5' has no attribute 'show_weights' – S34N Nov 14 '18 at 08:55
  • Not sure what the issue is. It works on my computer and is listed in documentation here: https://eli5.readthedocs.io/en/latest/overview.html Do you have the most recent version? – Justin Hallas Nov 15 '18 at 17:56
  • I had a chat with the eli5 developer; it turns out that the error AttributeError: module 'eli5' has no attribute 'show_weights' is only displayed when you're not using an IPython notebook, which I wasn't at the time the post was published. Strange phenomenon, but I will test it out with IPython installed. – S34N Nov 16 '18 at 11:41
  • eli5.show_weights outputs an HTML object, so it will only be displayed in iPython (jupyter) Notebook. – gradLife Aug 20 '19 at 20:34
  • 2
    why the sum of all the permutations (perm.feature_importances_) are not equal to one? – Henry Navarro Apr 02 '20 at 14:10
  • I would like to add that `eli5` currently only supports 2d arrays. If your model uses 3d layers like `GRU` or `LSTM`, `eli5` will not work for you. You need to use another library like `SHAP` instead. – user5305519 May 18 '20 at 03:21

This is a relatively old post with relatively old answers, so I would like to offer another suggestion: use SHAP to determine feature importance for your Keras models. SHAP supports both 2d and 3d arrays, whereas eli5 currently handles only 2d arrays (so if your model uses layers that require 3d input, like LSTM or GRU, eli5 will not work for you).

Here is the link to an example of how SHAP can plot the feature importance for your Keras models, but in case it ever breaks, some sample code and plots (taken from said link) are provided below as well:


import shap

# load your data here, e.g. X and y
# create and fit your model here

# load JS visualization code to notebook
shap.initjs()

# explain the model's predictions using SHAP
# (TreeExplainer covers tree-based models such as XGBoost, LightGBM,
# CatBoost, and scikit-learn tree ensembles -- see the note below for
# Keras networks)
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)

# visualize the first prediction's explanation (use matplotlib=True to avoid Javascript)
shap.force_plot(explainer.expected_value, shap_values[0,:], X.iloc[0,:])

shap.summary_plot(shap_values, X, plot_type="bar")

(images: SHAP force plot and summary bar plot of feature importance)
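
A caveat: TreeExplainer only supports tree-based models, which is the source of the Model type not yet supported by TreeExplainer errors reported in the comments below. For a Keras network you would reach for one of SHAP's neural-network explainers instead. A minimal sketch with DeepExplainer, assuming model is a fitted tf.keras model and X is a NumPy array of inputs:

import numpy as np
import shap

# DeepExplainer integrates over a background sample; a small random
# subset of the training data is a common choice
background = X[np.random.choice(X.shape[0], 100, replace=False)]

explainer = shap.DeepExplainer(model, background)
shap_values = explainer.shap_values(X[:100])  # explain the first 100 rows

shap.summary_plot(shap_values, X[:100], plot_type="bar")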

user5305519
  • 4
    Error when using `DeepExplainer`: `keras is no longer supported, please use tf.keras instead.` – Kermit Jun 27 '20 at 19:18
  • 8
    Error when using `TreeExplainer` `SHAPError: Model type not yet supported by TreeExplainer: ` – Kermit Jun 27 '20 at 19:18
  • @HashRocketSyntax I assume you are trying to use `Sequential` layer from Keras. Can you try importing `Sequential` using this instead? `from tensorflow.keras import Sequential` – user5305519 Jun 28 '20 at 03:46
  • 3
    @jarrettyeo, `from tensorflow.keras import Sequential` still doesn't work. I get the error: `Exception: Model type not yet supported by TreeExplainer: ` – Mitch Oct 21 '20 at 21:30
  • @user5305519 can you provide the solution to any of the above questions? I am also getting this error: Exception: Model type not yet supported by TreeExplainer: – 傅能杰 Dec 13 '21 at 16:26
  • @Kermit refer to https://stackoverflow.com/a/72480697/13046931 – seth Dec 04 '22 at 23:27

At the moment Keras doesn't provide any functionality to extract feature importance.

You can check this previous question: Keras: Any way to get variable importance?

or the related Google Group thread: Feature importance

Spoiler: in the Google Group, someone announced an open source project to solve this issue.

paolof89

A lame way is to get the weights for each neuron in each layer and show/stack them together. Note that only the first layer's weights map one-to-one onto the input features.

import numpy as np
import pandas as pd
import plotly.express as px

feature_df = pd.DataFrame(columns=['feature', 'layer', 'neuron', 'weight', 'abs_weight'])

for i, layer in enumerate(model.layers[:-1]):
    # kernel matrix of the layer, shape (n_inputs, n_neurons)
    w = np.array(layer.get_weights()[0])
    for n, neuron in enumerate(w.T):
        # pair each incoming weight with a feature name (only really
        # meaningful for the first layer)
        for f, name in zip(neuron, X.columns):
            feature_df.loc[len(feature_df)] = [name, i, n, f, abs(f)]

feature_df = feature_df.sort_values(by=['abs_weight'])
feature_df = feature_df.reset_index(drop=True)

fig = px.bar(feature_df, x='feature', y='abs_weight', template='simple_white')
fig.show()

It gives something like this, where the x-axis shows your features:

(image: bar chart of absolute weights per feature)
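
Since only the first layer's kernel lines up with the input features, a shorter variant of the same idea aggregates absolute first-layer weights per feature. A minimal sketch, assuming the same model and X as above:

import numpy as np
import pandas as pd
import plotly.express as px

# kernel of the first Dense layer, shape (n_features, n_neurons)
first_kernel = model.layers[0].get_weights()[0]

# total absolute outgoing weight per input feature
imp = pd.DataFrame({
    'feature': X.columns,
    'abs_weight_sum': np.abs(first_kernel).sum(axis=1),
}).sort_values('abs_weight_sum')

fig = px.bar(imp, x='feature', y='abs_weight_sum', template='simple_white')
fig.show()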

Noora