I am trying to assess the odds of people staying in a program given their backgrounds, following these instructions. One of the variables I am looking at is age, which I split into five groups. I fit a logistic regression with:

mylogit15 <- glm(Stay_in_Progams ~ Age.Group + Prior_Experience,
                 data = mydata, family = "binomial")

The results are clear enough, except that the first and third age groups are missing from the coefficient table. This is the output:

Coefficients:

             Estimate        Std. Error    z-value      Pr(>|z|)
(Intercept)   -.298             1.173      -1.98          .047
Age Group2    1.201             1.243       0.966         .333
Age Group4    2.735             1.486       1.840         .065
Age Group5    1.636             1.673       0.965         .334
Prior_Exp     3.546             1.234       2.735         .006

Thank you for taking the time to read this and help me out!

Zheyuan Li
safc
  • Please print the full output of running `glm(Stay_in_Progams ~ Age.Group + Prior_Experience, data = mydata, family = "binomial")`; I suspect those groups were dropped due to lack of variation. – MichaelChirico Aug 10 '15 at 22:34
  • With logistic regression, one class of any factor you include is always left out as a reference category. By default, it's the first class, probably Age Group 1 here. – ulfelder Aug 10 '15 at 22:35
  • For a categorical variable, R chooses the first level as the baseline, so you should expect not to see Age Group1, but I am not sure why Age Group 3 is missing. Can you confirm that the group exists in your data using `summary(mydata)`? – Rohit Das Aug 10 '15 at 22:37
  • Also, check out using `- 1`, e.g., `glm(Stay_in_Progams ~ Age.Group + Prior_Experience - 1, data = mydata, family = "binomial")` – Richard Erickson Aug 10 '15 at 22:59
  • I think the model output returns the specific factor with NA values in the case of lack of variation. Something else is happening with Group3 here.... – AntoniosK Aug 10 '15 at 23:12
  • I can confirm that Group 3 exists. It has 3 of the 34 subjects. Using the formula that Richard suggested returned Group 1 in the results instead of the intercept, so thank you, Richard. I am working on uploading a picture for Michael. Thank you ulfelder and Rohit Das as well; I had not understood that. – safc Aug 10 '15 at 23:21
  • Reviewing the data, I believe that Group 3 was dropped due to lack of data entered. Thank you all! – safc Aug 10 '15 at 23:27
  • I had the same problem: two categories had been dropped due to insufficient variance because of very few observations. It's always helpful to run `table` on the categorical column to see what is inside. – Joschi Dec 23 '20 at 11:52
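
The checks suggested in the comments can be tried on simulated data. The sketch below is an illustration only: the data frame is hypothetical (200 made-up rows, not the asker's 34 subjects), and it assumes that "lack of data entered" means every Group 3 row has a missing value in another model variable. Under that assumption, `glm` discards the incomplete rows and then drops the now-empty factor level, so Group 3 disappears without an `NA` coefficient, while Group 1 is absent simply because it is the reference level.

```r
# Hypothetical data: five age groups of 40 rows each (not the asker's data).
set.seed(1)
mydata <- data.frame(
  Age.Group        = factor(rep(1:5, each = 40)),
  Prior_Experience = rbinom(200, 1, 0.5)
)
mydata$Stay_in_Progams <- rbinom(200, 1, plogis(-0.5 + mydata$Prior_Experience))

# Assumption: group 3 has no usable data because a predictor was never entered.
mydata$Prior_Experience[mydata$Age.Group == "3"] <- NA

# Counts per level in the raw data, and among complete rows only.
table(mydata$Age.Group)
table(na.omit(mydata)$Age.Group)   # group 3 is left with zero complete rows

# The first level is the reference category and gets no coefficient of its own.
levels(mydata$Age.Group)[1]

fit <- glm(Stay_in_Progams ~ Age.Group + Prior_Experience,
           data = mydata, family = "binomial")
names(coef(fit))                   # which terms were actually estimated
c(rows = nrow(mydata), used = nobs(fit))

# Removing the intercept reports group 1 explicitly, as in the `- 1` suggestion.
fit_no_int <- update(fit, . ~ . - 1)
names(coef(fit_no_int))
```

Comparing `nrow(mydata)` with `nobs(fit)` is a quick way to see whether rows were discarded for missing values before the model was fit.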

0 Answers