
While reading the paper "A Unified Approach to Interpreting Model Predictions" by Lundberg and Lee (https://proceedings.neurips.cc/paper/2017/file/8a20a8621978632d76c43dfd28b67767-Paper.pdf), on page 3 I see:

Shapley sampling values are meant to explain any model by: (1) applying sampling approximations to Equation 4, and (2) approximating the effect of removing a variable from the model by integrating over samples from the training dataset. This eliminates the need to retrain the model and allows fewer than 2^|F| differences to be computed. Since the explanation model form of Shapley sampling values is the same as that for Shapley regression values, it is also an additive feature attribution method.
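For concreteness, my reading of step (2) is roughly this: instead of retraining a model on each feature subset S, the "removed" features are filled in with values drawn from the training data and the predictions are averaged. Here is a minimal sketch of that idea (the function and variable names are my own, not from the paper or the shap library):

```python
import numpy as np

def value_without_retraining(f, x, S, background):
    """Approximate the model's expected output when only the features in S
    are treated as present: features outside S are replaced by values taken
    from training (background) samples, and the predictions are averaged.

    f          -- predict function of the single, already-trained model
    x          -- 1-D array, the instance being explained
    S          -- list of feature indices treated as present
    background -- 2-D array of training samples to integrate over
    """
    X = background.copy()        # start from the background samples
    X[:, S] = x[S]               # pin the "present" features to x's values
    return f(X).mean()           # average predictions over the samples

# Toy usage, just to show the call shape (not a real model).
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    background = rng.normal(size=(100, 4))
    x = np.array([1.0, 2.0, 3.0, 4.0])
    f = lambda X: X.sum(axis=1)  # stand-in for any trained model
    print(value_without_retraining(f, x, S=[0, 2], background=background))
```

A feature i's marginal contribution to a coalition S would then be estimated as value_without_retraining(f, x, S + [i], background) - value_without_retraining(f, x, S, background), which only ever calls the one model that was already trained.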

My question is: how does sampling from the training dataset eliminate the need to retrain the model? It is not obvious to me, and I cannot see how to justify it mathematically. Any reference or explanation would be greatly appreciated; my internet searches have been unsuccessful. Thank you.

  • If you wanted an exact solution you would need to retrain the model every time the set of features changed. Lundberg et al. suggested averaging (integrating) over the removed features instead. Hence the gain in speed. – Sergey Bushmanov Jan 12 '23 at 04:34
  • Even more: if you retrained the model with different features, it would be a different model. You would then be explaining a modeling *technique* rather than a specific model. – Michael M Jan 14 '23 at 11:05

0 Answers