I am having some trouble with the following code:
model4 = glm(data = data16, Loan_Status_Coded ~ Coapplicant_Income_Modified +
    Dependents_SelfEmployed_1 + Dependents_Imputed_0_Dummy +
    Dependents_Imputed_1_Dummy + Dependents_Imputed_2_Dummy +
    Self_Employed_Imputed_Coded + Credit_History_Married + Married_Imputed_Coded +
    sqrt_LoanAmount_Imputed + Loan_Amount_Term_Imputed_Low_Dummy +
    Loan_Amount_Term_Imputed_Medium_Dummy + Credit_History_Imputed +
    Education_Coded + Property_Area_Semiurban_Dummy + Property_Area_Rural_Dummy,
    family = binomial(link = "logit"))
summary(model4)
predict5 = predict(data = data16, model4, type = "response")
table(data16$Loan_Status_Coded, predict5>0.5)
Running the table function gives the following error:
"all arguments must have the same length"
It seems the number of rows in predict5 is less than the number of rows in data16.
If I use predict5 = predict(newdata = data16, model4, type = "response"), then the error does not occur, but the table covers fewer data points than expected. For instance, the output when using newdata is:
    FALSE TRUE
  0    40   39
  1     7  176
These counts sum to only 262, but data16 has 614 rows.
What am I doing wrong here?
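
A minimal sketch of one possible cause, assuming the mismatch comes from rows with missing values in the model variables (this has not been checked against data16). The data frame `toy` and all object names below are hypothetical stand-ins. By default glm drops incomplete rows (na.action = na.omit), so predict without newdata returns one value per row actually used in the fit. Note also that predict has no `data` argument, so `data = data16` falls into `...` and is ignored.

```r
# Hypothetical toy data standing in for data16 (not the real data).
toy <- data.frame(
  y = rep(c(0, 1), times = 10),
  x = as.numeric(1:20)
)
toy$x[c(3, 7, 12)] <- NA  # three rows with a missing predictor

# Default na.action (na.omit): incomplete rows are dropped before fitting.
fit_omit <- glm(y ~ x, data = toy, family = binomial(link = "logit"))
p_omit <- predict(fit_omit, type = "response")
length(p_omit)  # 17, shorter than nrow(toy) == 20

# `data =` is not an argument of predict(); it is silently ignored.
p_ignored <- predict(fit_omit, data = toy, type = "response")
identical(p_ignored, p_omit)  # TRUE

# With newdata, the result has one entry per row, NA where x is missing.
p_new <- predict(fit_omit, newdata = toy, type = "response")
length(p_new)      # 20
sum(is.na(p_new))  # 3

# table() drops NA by default, so the counts cover only complete rows.
sum(table(toy$y, p_new > 0.5))              # 17
table(toy$y, p_new > 0.5, useNA = "ifany")  # shows the NA column explicitly

# Alternative: na.exclude pads fitted values with NA to the full length.
fit_excl <- glm(y ~ x, data = toy, family = binomial(link = "logit"),
                na.action = na.exclude)
p_excl <- predict(fit_excl, type = "response")
length(p_excl)  # 20
```

If this is what is happening in data16, a call such as sum(!complete.cases(data16)) should show how many rows contain at least one missing value, which can be compared against 614 minus the table total.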