samples.zip The sample zipped folder contains:
- model.pkl
- x_test.csv
To reproduce the problems, do the following steps:
- use
lin2 =joblib.load('model.pkl')
to load the linear regression model - use
x_test_2 = pd.read_csv('x_test.csv').drop(['Unnamed: 0'],axis=1)
to load thex_test_2
- run the code below to load the explainer
explainer_test = shap.Explainer(lin2.predict, x_test_2)
shap_values_test = explainer_test(x_test_2)
- Then run
partial_dependence_plot
to see the error message:
ValueError: x and y can be no greater than 2-D, but have shapes (2,) and (2, 1, 1)
sample_ind = 3
shap.partial_dependence_plot(
"new_personal_projection_delta",
lin.predict,
x_test, model_expected_value=True,
feature_expected_value=True, ice=False,
shap_values=shap_values_test[sample_ind:sample_ind+1,:]
)
- Run another function to plot waterfall to see error message:
Exception: waterfall_plot requires a scalar base_values of the model output as the first parameter, but you have passed an array as the first parameter! Try shap.waterfall_plot(explainer.base_values[0], values[0], X[0]) or for multi-output models try shap.waterfall_plot(explainer.base_values[0], values[0][0], X[0]).
shap.plots.waterfall(shap_values_test[sample_ind], max_display=14)
Questions:
- Why I cannot run
partial_dependence_plot
&shap.plots.waterfall
? - What changes I need to do with my input so I can run the methods above?