What is a "masker" when determining SHAP values in Python?

Question

I am playing around with the SHAP Python module to get some insight into feature importance for my models. Since I'm applying various algorithms to my data, I was trying to define some generic lines of code at the end of my Jupyter notebook to generate the SHAP values.

I had some trouble finding clear examples in the original documentation, but with some help from various sources I ended up with the code below. This works for xgboost XGBRegressor and also for sklearn RandomForestRegressor/LinearRegression/Lasso etc. So far so good!

import shap

# Initialize the Explainer (model, X_train and X_test are defined earlier in the notebook).
masker = shap.maskers.Independent(data=X_train, max_samples=100)
explainer = shap.Explainer(model=model, masker=masker)

# Generate the Explanations.
all_explanations = explainer(X_test)

# Extract the SHAP values.
shap_values = all_explanations.values

What I figured out while working towards this solution is that a LinearExplainer requires a "masker" parameter, while a TreeExplainer doesn't (although one may be passed). Since I wanted generic code, I use the generic Explainer class with a "masker".

But I don't really understand what this "masker" is or does, or why it is required for one type of Explainer and not for the other. I cannot find a clear conceptual explanation in the module documentation. On Medium there are numerous articles explaining SHAP, but most of them use tree-based models, for which the "masker" is not required and therefore not explained. Can anyone explain, or point to a good reference?

Edit
My question was marked as a duplicate of this question, but I'm looking for a more in-depth explanation. Why is a "masker" required for one type of Explainer, but optional for the other? What does it mean to provide the complete X_train as a "masker"? And what does the max_samples parameter do? I couldn't find those answers in the other question.

I did find this useful resource, which got me a little further in understanding the background dataset (for which the "masker" is used).
I also found this useful resource, which explains that you should use X_train for the definition of the "masker" (as opposed to X_test).
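
To make the question concrete, here is my current mental model of an "independent" masker as a plain-Python sketch. This is not SHAP's implementation: masked_prediction, toy_model and the data are made up for illustration, and I am assuming that a hidden feature is replaced by the values from the background rows (with the model output averaged over them) and that max_samples simply subsamples the background data when it is larger than that.

```python
import random


def masked_prediction(predict, x, keep, background, max_samples=100, seed=0):
    """Average predict() over copies of x whose hidden features come from background rows.

    keep[i] is True when feature i keeps its value from x, and False when it is
    "masked", i.e. replaced by the corresponding value of a background row.
    """
    rng = random.Random(seed)
    # Assumption: max_samples caps how many background rows are used.
    if len(background) > max_samples:
        background = rng.sample(background, max_samples)
    outputs = []
    for row in background:
        # Build a hybrid row: kept features from x, masked features from the background row.
        mixed = [xi if k else bi for xi, bi, k in zip(x, row, keep)]
        outputs.append(predict(mixed))
    return sum(outputs) / len(outputs)


# Hypothetical toy model and data, only to show the mechanics.
def toy_model(row):
    return 2.0 * row[0] + 3.0 * row[1]


X_background = [[0.0, 0.0], [2.0, 4.0]]  # stands in for X_train
x = [10.0, 10.0]                          # one row to explain

base_value = masked_prediction(toy_model, x, [False, False], X_background)
only_first = masked_prediction(toy_model, x, [True, False], X_background)
full = masked_prediction(toy_model, x, [True, True], X_background)
print(base_value, only_first, full)
```

If this picture is right, then with everything masked the result is just the average prediction over the background rows, and with nothing masked it is the ordinary prediction for x. That would also explain why the background should come from X_train, but I would like confirmation that this is what the "masker" actually does.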
