Get individual features importance with XGBoost

Question

I have trained an XGBoost binary classifier and would like to extract feature importance for each observation I give to the model (I already have global feature importance).

More specifically, I am looking for a way to determine, for each instance given to the model, which features have the most impact and push the input toward one class or the other. I would like to know something like the top 5 features that make the observation belong to a class, along with indications of how I should modify these 5 features so that the probability of belonging to this class decreases or increases.

For example, let’s say my model predicts whether a house costs more than 100,000 dollars (this is the positive class) based on its location, surface and number of bedrooms. I give it the following input: London, 400 square feet, 4 bedrooms, and my model predicts a probability of 56% for the house to be in the positive class. I am looking for a Python module or function that would show the most influential features for each observation.

score 3 · Answer 1 · answered Aug 02 '19 at 09:48

There are several methods for that. You can use the native importance measures from the xgboost library. Check this answer: https://stackoverflow.com/a/51645066/3733974

You can also look for alternative methods. Here are two of them I can recommend:

Permutation importance. Basically, you permute the values of each of your predictors in turn and check the loss of accuracy for each of them. Here is an article explaining it:
SHAP (SHapley Additive exPlanations) values. A nice article that explains SHAP values, as well as the native importance measures of XGBoost:
