
I get the warnings below when fitting a logistic regression with glm and can't work out how to fix them. How can I solve this problem?

setwd("C:/Users/ali.erkun/Desktop/524")
library(readxl)  # provides read_excel()
data<-read_excel("Take Home Dataset.xls")
attach(data)

L12M_LOAN_ACCEPT_2<-as.numeric(L12M_LOAN_ACCEPT)
# NAs introduced by coercion - we must omit NAs
L12M_omitted<-na.omit(L12M_LOAN_ACCEPT_2)

summary(L12M_omitted)

data$L12M_LOAN_ACCEPT <-as.numeric(factor(ifelse(data$L12M_LOAN_ACCEPT == ".", 0,data$L12M_LOAN_ACCEPT)))

str(data)

fit <- glm(LABEL ~ ., family = binomial(link = "logit"), data = data)

loanlogit_full <- glm(LABEL ~ ., family = binomial(link = "logit"), data = data)
Warning messages:
1: glm.fit: algorithm did not converge
2: glm.fit: fitted probabilities numerically 0 or 1 occurred

Arun kumar mahesh
Ali Erkun
  • You can see a similar question and its answer here: https://stats.stackexchange.com/questions/5354/logistic-regression-model-does-not-converge – Diep N. Apr 05 '20 at 18:51
  • I would suspect that there is a discrete variable (perhaps high income, being a woman, or having obtained a previous loan) for which all the loan applications were accepted. If every customer who previously applied successfully has the new loan accepted, the probability is 1 and there is nothing to estimate. The same applies if loan applications from customers in job category A are systematically rejected. You need to understand your data and your model better. – DJJ Apr 07 '20 at 09:17
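
The check suggested in the last comment can be sketched as a cross-tabulation of a discrete predictor against the outcome. This is only an illustration on made-up data: LABEL is the response name from the question, while prev_loan and its values are hypothetical and not taken from the real dataset.

```r
# Hypothetical toy data: every applicant with a previous loan was accepted
toy <- data.frame(
  LABEL     = c(1, 1, 1, 1, 0, 1, 0, 0),
  prev_loan = c("yes", "yes", "yes", "yes", "no", "no", "no", "no")
)

# Rows are levels of the predictor, columns are the two outcomes
tab <- table(toy$prev_loan, toy$LABEL)
print(tab)

# Levels with a zero count in either outcome column predict the outcome perfectly
zero_levels <- rownames(tab)[apply(tab == 0, 1, any)]
print(zero_levels)
```

Repeating this for each factor or low-cardinality column in the real data is one way to find candidate variables; it does not detect separation that only arises from a combination of predictors.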

1 Answer


It's likely that you can achieve perfect separation of your response with one or more of your predictor variables. See this answer and the one that @John.G linked. If one of your independent variables perfectly predicts the outcome, the coefficient for that predictor will grow without bound (and therefore won't converge).
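
To make the idea concrete, here is a minimal sketch on made-up data, not the asker's dataset. The data frame toy, its columns y, x and z, and the helper separates() are all hypothetical names introduced for illustration. The predictor x splits the outcome perfectly, so glm() should raise warnings of the same kind as in the question, and a simple range check flags x but not z.

```r
# Hypothetical toy data: y is 1 exactly when x > 4, z is unrelated noise
toy <- data.frame(
  y = c(0, 0, 0, 0, 1, 1, 1, 1),
  x = c(1, 2, 3, 4, 5, 6, 7, 8),
  z = c(1, 0, 1, 0, 0, 1, 0, 1)
)

# Fit the model and record the warnings glm() raises instead of printing them
warns <- character(0)
fit <- withCallingHandlers(
  glm(y ~ x + z, family = binomial(link = "logit"), data = toy),
  warning = function(w) {
    warns <<- c(warns, conditionMessage(w))
    invokeRestart("muffleWarning")
  }
)
print(warns)

# TRUE when the value ranges of a numeric predictor do not overlap between classes
separates <- function(x, y) {
  r0 <- range(x[y == 0])
  r1 <- range(x[y == 1])
  r0[2] < r1[1] || r1[2] < r0[1]
}
sep <- sapply(toy[c("x", "z")], separates, y = toy$y)
print(sep)
```

The range check only catches complete separation by a single numeric predictor. Separation caused by a combination of variables, or by one level of a factor, needs other diagnostics.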

There are a number of ways to deal with this problem - https://stats.stackexchange.com/a/68917/275337 gives a very good overview of the options.
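
As a rough illustration of one family of remedies, penalized likelihood, the sketch below fits a ridge-penalized logistic regression in base R on made-up separated data. The toy data, the function neg_pen_loglik() and the penalty weight lambda = 0.1 are assumptions chosen for illustration, not something taken from the question or from the linked answer. In practice a dedicated package (for example logistf for Firth's penalized likelihood) is probably a better choice than hand-rolled optimization.

```r
# Hypothetical toy data with perfect separation (y = 1 exactly when x > 4)
toy <- data.frame(
  y = c(0, 0, 0, 0, 1, 1, 1, 1),
  x = c(1, 2, 3, 4, 5, 6, 7, 8)
)
X <- model.matrix(y ~ x, data = toy)

# Negative log-likelihood plus a ridge penalty on the slopes (not the intercept)
neg_pen_loglik <- function(beta, X, y, lambda) {
  eta <- drop(X %*% beta)
  # log(1 + exp(eta)) written in a numerically stable form
  log1pexp <- pmax(eta, 0) + log1p(exp(-abs(eta)))
  loglik <- sum(y * eta - log1pexp)
  -loglik + lambda * sum(beta[-1]^2)
}

# lambda = 0.1 is an arbitrary illustrative value, not a recommendation
opt <- optim(
  par = rep(0, ncol(X)), fn = neg_pen_loglik,
  X = X, y = toy$y, lambda = 0.1, method = "BFGS"
)
ridge_coef <- setNames(opt$par, colnames(X))
print(ridge_coef)
```

The penalty keeps the slope finite even though the unpenalized maximum-likelihood estimate does not exist for these data. The size of the estimate depends on lambda, which would normally be chosen by cross-validation or a similar procedure.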

RyanFrost