So when feature_perturbation='tree_path_dependent', the data argument is optional. If we pass a background dataset, do we get the same behaviour as with feature_perturbation='interventional'?

From my minimal example that's what it seems like, at least for expected_value:

import shap
import numpy as np
from sklearn.tree import DecisionTreeRegressor

num_points = 500
num_samples = 100
num_features = 5
rng = np.random.default_rng(seed=1)
X = rng.normal(size=(num_points, num_features))
y = rng.integers(2, size=(num_points,))
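# Caveat: np.random.randint uses the global, unseeded NumPy RNG (not rng), so X_sample differs between runs
X_sample = X[np.random.randint(X.shape[0], size=num_samples), :]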

dt_model = DecisionTreeRegressor(max_depth=2).fit(X, y)
explainer1 = shap.TreeExplainer(dt_model, feature_perturbation='tree_path_dependent', model_output='raw')
explainer2 = shap.TreeExplainer(dt_model, feature_perturbation='tree_path_dependent', data=X_sample, model_output='raw')
explainer3 = shap.TreeExplainer(dt_model, feature_perturbation='interventional', data=X_sample, model_output='raw')
print(f'explainer1.expected_value = {explainer1.expected_value}')
print(f'explainer2.expected_value = {explainer2.expected_value}')
print(f'explainer3.expected_value = {explainer3.expected_value}')

Output:

explainer1.expected_value = [0.514]
explainer2.expected_value = 0.5139024767801856
explainer3.expected_value = 0.5139024767801856
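
One possible reading of these numbers: 0.514 looks like the training-set mean stored in the tree's root node, while the other two values look like the mean prediction over X_sample. The sketch below checks that reading using scikit-learn only. It assumes, without confirming it against the shap source, that this is how expected_value is derived in each case. The helper name expected_value_candidates is hypothetical, and X_sample is drawn from the seeded rng here, so the printed values will not match the ones above exactly.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor


def expected_value_candidates(model, background):
    """Return (root-node value, mean prediction over the background set).

    Hypothetical helper: these are two candidate definitions of a baseline,
    not a reproduction of shap's internal computation.
    """
    # For a regression tree, the root node stores the mean training target.
    root_value = float(model.tree_.value[0].ravel()[0])
    # Average model output over the rows of the background set.
    background_mean = float(model.predict(background).mean())
    return root_value, background_mean


rng = np.random.default_rng(seed=1)
X = rng.normal(size=(500, 5))
y = rng.integers(2, size=(500,))
# Assumption: sample with the seeded generator so the run is reproducible.
X_sample = X[rng.integers(X.shape[0], size=100), :]

dt_model = DecisionTreeRegressor(max_depth=2).fit(X, y)
root_value, background_mean = expected_value_candidates(dt_model, X_sample)
print(f"root node value             = {root_value}")
print(f"mean prediction on X_sample = {background_mean}")
```

If the reading is right, the first number should correspond to explainer1 and the second to explainers 2 and 3. Matching expected_value alone does not show that the SHAP values themselves agree.
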
  • I found a possible reason why the shap authors kept both the background dataset argument and the feature_perturbation option. They allow the user to pass a background set with tree_path_dependent as long as that set lands on every leaf; interventional SHAP has no such requirement. They raise this error when the background set does not land on every leaf: [Line 263 in _tree.py](https://github.com/slundberg/shap/blob/45b85c1837283fdaeed7440ec6365a886af4a333/shap/explainers/_tree.py#L263) – Kyriacos Xanthos Feb 22 '23 at 16:18
  • I think you're on the right track exploring the source code. There are lots of subjective decisions in there, and to be aware of them you need to follow the full path from instantiating the explainer object to presenting the results. – Sergey Bushmanov Feb 22 '23 at 18:27
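
The leaf-coverage condition mentioned in the first comment can be checked from the scikit-learn side before a background set is handed to the explainer. This is a minimal sketch, not shap's own check: the helper name unreached_leaves is hypothetical, and it assumes "lands on every leaf" means that every leaf node receives at least one background row.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor


def unreached_leaves(model, background):
    """Return the node ids of leaves that no background row falls into."""
    tree = model.tree_
    # In scikit-learn's tree arrays, a leaf has children_left == -1.
    all_leaves = np.flatnonzero(tree.children_left == -1)
    # apply() returns the id of the leaf each row ends up in.
    reached = np.unique(model.apply(background))
    return np.setdiff1d(all_leaves, reached)


rng = np.random.default_rng(seed=1)
X = rng.normal(size=(500, 5))
y = rng.integers(2, size=(500,))
dt_model = DecisionTreeRegressor(max_depth=2).fit(X, y)

full_background = X      # the training data reaches every leaf by construction
tiny_background = X[:1]  # a single row can reach only one leaf

print("unreached with full background:", unreached_leaves(dt_model, full_background))
print("unreached with tiny background:", unreached_leaves(dt_model, tiny_background))
```

An empty result suggests the background set covers every leaf of a single tree; for an ensemble, the same check would have to be repeated for each tree.
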
