Is there any way to calculate the residual deviance of a scikit-learn logistic regression model? This is a standard output from R model summaries, but I couldn't find it in any of sklearn's documentation.
4 Answers
- As suggested by @russell-richie, it should be `model.predict_proba`.
- Don't forget the argument `normalize=False` in `metrics.log_loss()` to return the sum of the per-sample losses.

So to complete @ingo's answer, to obtain the model deviance with sklearn.linear_model.LogisticRegression, you can compute:

from sklearn import metrics

def deviance(X, y, model):
    return 2 * metrics.log_loss(y, model.predict_proba(X), normalize=False)
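For example (a minimal usage sketch; the synthetic data from make_classification and the default LogisticRegression below are just placeholders, not part of the original answer):

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# placeholder data and model, only to illustrate the call
X, y = make_classification(n_samples=200, random_state=0)
model = LogisticRegression().fit(X, y)
print(deviance(X, y, model))  # residual deviance of the fitted model on X, y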

Actually, you can. Deviance is closely related to cross-entropy, which is in sklearn.metrics.log_loss. Deviance is just 2*(loglikelihood_of_saturated_model - loglikelihood_of_fitted_model). Scikit-learn can (without larger tweaks) only handle classification of individual instances, so the log-likelihood of the saturated model is going to be zero. The cross-entropy returned by log_loss is the negative log-likelihood. Thus, the deviance is simply

def deviance(X, y, model):
    return 2*metrics.log_loss(y, model.predict_log_proba(X))

I know this is a very late answer, but I hope it helps anyway.
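As a quick sanity check (a hedged sketch with placeholder data, not from the original answer): log_loss expects probabilities rather than log-probabilities, as the comments below point out, so with predict_proba the quantity 2*log_loss(..., normalize=False) reproduces the textbook residual deviance, i.e. -2 times the summed log-probability of the observed classes:

import numpy as np
from sklearn import metrics
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=100, random_state=0)
model = LogisticRegression().fit(X, y)
proba = model.predict_proba(X)

dev_log_loss = 2 * metrics.log_loss(y, proba, normalize=False)
dev_manual = -2 * np.sum(np.log(proba[np.arange(len(y)), y]))  # -2 * log-likelihood of the fitted model
print(np.isclose(dev_log_loss, dev_manual))  # True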

- Should it be `model.predict_log_proba`, or just `.predict_proba`? The log_loss documentation indicates that the second argument `y_pred` should be probabilities... – Russell Richie Jun 18 '19 at 17:12
- Correct, the documentation is unspecific here. It should be `log_proba`. – Ingo May 01 '20 at 14:26
- The documentation states that the second parameter should be probabilities, not log(probabilities). – Manuel Aug 04 '20 at 06:12
Here is a Python implementation of explained_deviance that implements the discussion from this thread: Github code
import numpy as np
from scipy.special import softmax, expit
from sklearn.metrics import log_loss
from sklearn.dummy import DummyClassifier

# deviance function
def explained_deviance(y_true, y_pred_logits=None, y_pred_probas=None,
                       returnloglikes=False):
    """Computes explained_deviance score to be comparable to explained_variance"""
    assert y_pred_logits is not None or y_pred_probas is not None, \
        "Either the predicted probabilities (y_pred_probas) or the predicted logit values (y_pred_logits) \
should be provided. But neither of the two were provided."

    if y_pred_logits is not None and y_pred_probas is None:
        # check if binary or multiclass classification
        if y_pred_logits.ndim == 1:
            y_pred_probas = expit(y_pred_logits)
        elif y_pred_logits.ndim == 2:
            y_pred_probas = softmax(y_pred_logits, axis=-1)  # softmax across classes for each sample
        else:  # invalid
            raise ValueError(f"logits passed seem to have incorrect shape of {y_pred_logits.shape}")

    if y_pred_probas.ndim == 1:
        y_pred_probas = np.stack([1 - y_pred_probas, y_pred_probas], axis=-1)

    # compute a null model's predicted probability
    X_dummy = np.zeros(len(y_true))
    y_null_probas = DummyClassifier(strategy='prior').fit(X_dummy, y_true).predict_proba(X_dummy)
    # strategy : {"most_frequent", "prior", "stratified", "uniform", "constant"}
    # suggestion from https://stackoverflow.com/a/53215317
    llf = -log_loss(y_true, y_pred_probas, normalize=False)
    llnull = -log_loss(y_true, y_null_probas, normalize=False)
    ### McFadden’s pseudo-R-squared: 1 - (llf / llnull)
    explained_deviance = 1 - (llf / llnull)
    ## Cox & Snell’s pseudo-R-squared: 1 - exp((llnull - llf)*(2/nobs))
    # explained_deviance = 1 - np.exp((llnull - llf) * (2 / len(y_pred_probas)))  ## TODO, not implemented
    if returnloglikes:
        return explained_deviance, {'loglike_model': llf, 'loglike_null': llnull}
    else:
        return explained_deviance
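A brief usage sketch (the data and model below are placeholders, not part of the linked Github code):

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=300, random_state=0)
model = LogisticRegression().fit(X, y)

# pass predicted probabilities; raw decision_function scores could be passed as y_pred_logits instead
score, loglikes = explained_deviance(y, y_pred_probas=model.predict_proba(X), returnloglikes=True)
print(score)     # McFadden-style explained deviance (pseudo R-squared)
print(loglikes)  # {'loglike_model': ..., 'loglike_null': ...}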
