
This seems like a very simple question, but I just can't figure it out.

I'm running a logit using the glm function, but I keep getting warning messages relating to the independent variable. The variables are stored as factors; I've converted them to numeric, and I've also recoded them to 0/1, but neither worked.

Please help!

> mod2 <- glm(winorlose1 ~ bid1, family="binomial")
Warning messages:
1: glm.fit: algorithm did not converge 
2: glm.fit: fitted probabilities numerically 0 or 1 occurred 

I also tried it in Zelig, but got similar warnings:

> mod2 = zelig(factor(winorlose1) ~ bid1, data=dat, model="logit")
How to cite this model in Zelig:
Kosuke Imai, Gary King, and Oliva Lau. 2008. "logit: Logistic Regression for Dichotomous Dependent Variables" in Kosuke Imai, Gary King, and Olivia Lau, "Zelig: Everyone's Statistical Software," http://gking.harvard.edu/zelig
Warning messages:
1: glm.fit: algorithm did not converge 
2: glm.fit: fitted probabilities numerically 0 or 1 occurred 

EDIT:

> str(dat)
'data.frame':   3493 obs. of  3 variables:
 $ winorlose1: int  2 2 2 2 2 2 2 2 2 2 ...
 $ bid1      : int  700 300 700 300 500 300 300 700 300 300 ...
 $ home      : int  1 0 1 0 0 0 0 1 0 0 ...
 - attr(*, "na.action")=Class 'omit'  Named int [1:63021] 3494 3495 3496 3497 3498 3499 3500 3501 3502 3503 ...
  .. ..- attr(*, "names")= chr [1:63021] "3494" "3495" "3496" "3497" ...
ATMathew
  • This will be impossible to answer without some detailed information about your data. `str(dat)` for instance. Also, those are warnings, not errors. There's a big difference. – joran Dec 21 '11 at 20:58
  • I just wanted to note that there is a `glm2` package which claims to achieve convergence where `glm` does not. I don't know if this has to do with the problem here or not. See http://journal.r-project.org/archive/2011-2/RJournal_2011-2_Marschner.pdf – Xu Wang Dec 22 '11 at 06:46
  • As it seems that you're working with categorical data, I'd consider casting your integer variables as factors: `dat$home <- as.factor(dat$home)` – eamo Sep 20 '13 at 16:11
  • Some more information [on http://stats.stackexchange.com](http://stats.stackexchange.com/questions/11109/how-to-deal-with-perfect-separation-in-logistic-regression) – DJJ Jun 14 '14 at 08:33
  • Added some possible solutions, with reference to concrete packages you could try... – Tom Wenseleers Jan 03 '19 at 15:52

3 Answers


If you look at ?glm (or even do a Google search for your second warning message) you may stumble across this from the documentation:

For the background to warning messages about ‘fitted probabilities numerically 0 or 1 occurred’ for binomial GLMs, see Venables & Ripley (2002, pp. 197–8).

Now, not everyone has that book. But assuming it's kosher for me to do this, here's the relevant passage:

There is one fairly common circumstance in which both convergence problems and the Hauck-Donner phenomenon can occur. This is when the fitted probabilities are extremely close to zero or one. Consider a medical diagnosis problem with thousands of cases and around 50 binary explanatory variables (which may arise from coding fewer categorical variables); one of these indicators is rarely true but always indicates that the disease is present. Then the fitted probabilities of cases with that indicator should be one, which can only be achieved by taking βi = ∞. The result from glm will be warnings and an estimated coefficient of around +/- 10. There has been fairly extensive discussion of this in the statistical literature, usually claiming non-existence of maximum likelihood estimates; see Santner and Duffy (1989, p. 234).

One of the authors of this book commented in somewhat more detail here. So the lesson here is to look carefully at one of the levels of your predictor. (And Google the warning message!)
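
For illustration, here is a minimal sketch with toy data (not the OP's dat) showing how complete separation produces exactly these warnings; the variables y and x are made up for the example:

y <- c(0, 0, 0, 1, 1, 1)
x <- 1:6                              # x perfectly separates the 0s from the 1s
mod <- glm(y ~ x, family = binomial)  # typically emits both warnings above
coef(mod)                             # the huge slope estimate is the telltale sign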

joran
  • +1 Good answer. Just to add: it's good to look at the model, the model diagnostics, and sometimes a different model. For instance, try a classification tree. This may tell you that either (a) you have an excellent predictor (good thing), or (b) you have some sampling problems (bad thing). – Iterator Dec 22 '11 at 01:16
  • Does this answer address only the 2nd warning from the OP's question? I found at http://discuss.analyticsvidhya.com/t/warning-message-glm-fit-algorithm-did-not-converge/5299 the suggestion to adjust the parameter `maxit` (which is not listed in the documentation for `glm`, but is passed as part of the `control` parameter to `glm.fit` and then to `glm.control`), and that seems to have resolved the 1st warning `1: glm.fit: algorithm did not converge` for me. – Paul de Barros Apr 14 '16 at 12:43
  • I found your answer very useful joran, but I still don't understand how to solve the problem based on your answer. My understanding (based on the quote in your answer) is that one of the levels of one of my predictor variables is rarely true but always indicates that the outcome variable is either 0 or 1. Firstly, surely any decent statistical method should be able to deal with this? Secondly, how do I find the predictor variable, and once I do find it, what do I do with it? – Parsa Jul 05 '16 at 14:33
  • @par An algorithmic approach to "solving" this problem is often to employ some form of regularization. But it is also wise to reconsider your choices of covariates in the context of your model, and how meaningful they might be. As the quote indicates, you can often spot the problem variable by looking for a coefficient of +/- 10. – joran Jul 05 '16 at 15:02

This is probably due to complete separation, i.e. some level of a predictor corresponding entirely to 0s or entirely to 1s in the outcome.

There are several options to deal with this:

(a) Use Firth's penalized likelihood method, as implemented in the logistf or brglm packages in R. This uses the method proposed in Firth (1993), "Bias reduction of maximum likelihood estimates", Biometrika, 80, 1, which removes the first-order bias from maximum likelihood estimates.

(b) Use median-unbiased estimates in exact conditional logistic regression, as implemented in the elrm or logistiX packages in R.

(c) Use LASSO or elastic net regularized logistic regression, e.g. using the glmnet package in R.

(d) Go Bayesian, cf. Gelman et al. (2008), "A weakly informative default prior distribution for logistic & other regression models", Ann. Appl. Stat., 2, 4, and the bayesglm function in the arm package.

(e) Use a hidden logistic regression model, as described in Rousseeuw & Christmann (2003),"Robustness against separation and outliers in logistic regression", Computational Statistics & Data Analysis, 43, 3, and implemented in the R package hlr.

Whichever option you choose, you need to recode your predictor as a factor first, using `dat$bid1 <- as.factor(dat$bid1)`.
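
As a minimal, hedged sketch of option (a), using the logistf package (assuming it is installed; treating winorlose1 == 2 as a win is an assumption about the OP's coding):

library(logistf)
dat$winorlose1 <- as.integer(dat$winorlose1 == 2)  # assumed recode of the 1/2 outcome to 0/1
dat$bid1 <- as.factor(dat$bid1)                    # categorical predictor, as noted above
fit <- logistf(winorlose1 ~ bid1, data = dat)      # Firth's penalized likelihood
summary(fit)                                       # coefficients stay finite even under separation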

Solutions to this problem are also discussed here:

https://stats.stackexchange.com/questions/11109/how-to-deal-with-perfect-separation-in-logistic-regression

https://stats.stackexchange.com/questions/45803/logistic-regression-in-r-resulted-in-perfect-separation-hauck-donner-phenomenon

https://stats.stackexchange.com/questions/239928/is-there-any-intuitive-explanation-of-why-logistic-regression-will-not-work-for

https://stats.stackexchange.com/questions/5354/logistic-regression-model-does-not-converge?rq=1

Tom Wenseleers

Even if you have correctly specified the GLM formula and the corresponding inputs (design matrix, link function, etc.), glm may fail to converge simply because the iteratively reweighted least squares (IRLS) algorithm runs out of iterations. Try raising maxit from its default of 25 to, say, 100.
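
A minimal sketch reusing the OP's variable names (passing data = dat explicitly is an assumption; the original call relied on variables in the workspace):

mod2 <- glm(winorlose1 ~ bid1, data = dat, family = binomial,
            control = glm.control(maxit = 100))  # raise the IRLS iteration cap from 25

Note that this addresses only the first warning; it does nothing about separation.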

  • That's possible, but it's rather unusual (in my experience at least) that glm fails to converge in 25 iterations but succeeds in 100 ... (and it doesn't explain the second warning message) – Ben Bolker Dec 20 '18 at 00:34