
I am trying to evaluate a logistic regression model with a residual plot in Python.
I searched online but could not find how to do this.
Based on this answer, it seems that the deviance can be calculated as follows:

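from sklearn.metrics import log_loss

def deviance(X_test, y_true, model):
    # log_loss expects probabilities, not log-probabilities;
    # normalize=False sums over samples instead of averaging.
    return 2 * log_loss(y_true, model.predict_proba(X_test), normalize=False)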

This returns a single numeric value rather than one residual per observation.
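
A per-observation version of this quantity, which is what a residual plot needs, could be sketched as below. This is a minimal sketch, assuming a binary 0/1 target and the predicted probability of the positive class for each row (for a scikit-learn classifier that would be `model.predict_proba(X_test)[:, 1]`). The function name `deviance_residuals` and the clipping constant `eps` are illustrative choices, not part of any library.

```python
import numpy as np

def deviance_residuals(y_true, p_hat, eps=1e-12):
    """Per-observation deviance residuals for a binary (0/1) target.

    p_hat holds the predicted probability of class 1 for each row.
    """
    y = np.asarray(y_true, dtype=float)
    # Clip to avoid log(0) when a prediction is exactly 0 or 1.
    p = np.clip(np.asarray(p_hat, dtype=float), eps, 1 - eps)
    # Bernoulli log-likelihood of each observation (always <= 0).
    log_lik = y * np.log(p) + (1 - y) * np.log(1 - p)
    # Signed square root of each observation's deviance contribution.
    return np.sign(y - p) * np.sqrt(-2 * log_lik)
```

The squares of these residuals sum to the total deviance, so the single number above is recovered with `np.sum(deviance_residuals(y, p) ** 2)`.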

However, when fitting a GLM we can usually inspect residual plots. It seems that there is no Python package for plotting logistic regression residuals, whether Pearson or deviance.

Moreover, I found an interesting package, ResidualsPlot, but I'm not sure whether it can be used for logistic regression.

Any suggestions for how to plot the residuals?

In addition, I also found a resource here, but it covers OLS rather than logit. It seems that the residuals are calculated a little differently in that case.

Peter Chen
  • My initial reaction is to say you could just take the difference of the true values and the values predicted by the regression (the score), and plot those, since they are, in fact, the residuals. You could plot these against the explanatory variables to look for issues. Is your question more about how to find the residuals, find relationships, or do the actual mechanics of the plotting? – Savage Henry Jun 27 '19 at 15:33
  • I think the approach you mentioned, the difference, is what `ResidualsPlot` does. However, my question is about logistic regression: it seems that there are `pearson residuals` and `deviance residuals`. How do I calculate these two and plot them in `Python`? – Peter Chen Jun 27 '19 at 15:39
  • I'll go with my usual answer: think about what you're trying to do. If you want to do some forensics on the model, the residuals are great for exploring whether you have some unmodeled relationships. The basic residuals (the true minus the score) could be plotted against x1, x2, etc. to see if there are obvious patterns (say, one of them should be entered with a quadratic). The more sophisticated residual formulas might be interesting, but do they actually get at what you want? – Savage Henry Jun 27 '19 at 17:24
  • The more complex residual formulas, `pearson` or `deviance`, are good to use when modeling in logistic regression. Actually, I'm not sure whether it is good to solely use y_actual minus predicted. – Peter Chen Jun 27 '19 at 18:13
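
The raw residual discussed in these comments (true minus score) and the Pearson residual differ only by a scaling: the Pearson version divides by the Bernoulli standard deviation sqrt(p(1 - p)), so observations with very confident predictions are not automatically squeezed toward zero. A minimal sketch, assuming a 0/1 target and predicted probabilities of class 1; `pearson_residuals` and `plot_residuals` are hypothetical helper names, and matplotlib is assumed to be available for the plotting part.

```python
import numpy as np

def pearson_residuals(y_true, p_hat, eps=1e-12):
    """Pearson residuals for a binary (0/1) target."""
    y = np.asarray(y_true, dtype=float)
    # Clip so the denominator never becomes exactly zero.
    p = np.clip(np.asarray(p_hat, dtype=float), eps, 1 - eps)
    # Raw residual scaled by the Bernoulli standard deviation.
    return (y - p) / np.sqrt(p * (1 - p))

def plot_residuals(p_hat, residuals, label="Pearson residual"):
    """Scatter residuals against predicted probabilities."""
    # Imported here so pearson_residuals works without matplotlib installed.
    import matplotlib.pyplot as plt
    fig, ax = plt.subplots()
    ax.scatter(p_hat, residuals, s=10, alpha=0.6)
    ax.axhline(0.0, color="grey", linewidth=1)  # reference line at zero
    ax.set_xlabel("Predicted probability")
    ax.set_ylabel(label)
    return ax
```

With a fitted scikit-learn classifier, a call might look like `plot_residuals(p, pearson_residuals(y_test, p))` where `p = model.predict_proba(X_test)[:, 1]`. Because the target is binary, the points fall on two curved bands, so binning or smoothing the residuals is often easier to read than the raw scatter.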

0 Answers