I'm fitting a binary logistic regression in R, and some of the independent variables are ordinal. I just want to make sure I'm doing it correctly. In the example below, I created sample data and ran glm() treating the independent variable "I" as continuous. Then I ran it again using ordered(I) instead. The results came out a little differently, so the test seems to have worked. My question is whether it is doing what I think it is doing: that is, seeing the integer data, coercing it to an ordered factor based on the integer values, and fitting the glm() with a different model specification that allows the distances between "1", "2", "3", etc. to differ, which would make it the "correct" approach for ordinal data. Is that correct?
> str(gorilla)
'data.frame': 14 obs. of 2 variables:
$ I: int 1 1 1 2 2 2 3 3 4 4 ...
$ D: int 0 0 1 0 0 1 1 1 0 1 ...
> glm.out = glm(D ~ I, family=binomial(logit), data=gorilla)
> summary(glm.out)
Then I tried it again with ordered():
> glm.out = glm(D ~ ordered(I), family=binomial(logit), data=gorilla)
> summary(glm.out)
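For reference, here is a minimal sketch of one way to check which coding R applies to an ordered factor in a model formula. It assumes the default contrasts option is in effect, under which ordered factors receive orthogonal polynomial contrasts from contr.poly(). The vector I is retyped from the "Data used" listing, and the name cm is only illustrative.

```r
# Independent variable retyped from the "Data used" listing (assumption: same 14 values)
I <- rep(1:5, times = c(3, 3, 2, 2, 4))

# Default contrast settings: first entry for unordered, second for ordered factors
print(getOption("contrasts"))

# Contrast matrix that a model formula would use for ordered(I)
cm <- contrasts(ordered(I))
print(round(cm, 3))

# Compare with the orthogonal polynomial contrasts for 5 equally spaced levels
print(all.equal(unname(cm), unname(contr.poly(5))))
```

The column names of this matrix (.L, .Q, .C, ^4) match the coefficient labels in the ordered(I) output, which suggests those terms are linear, quadratic, cubic and quartic trends over equally spaced levels rather than estimated distances between levels.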
PS: In case it helps, here is the full output from both tests. One thing I notice is the very large standard errors in the second model:
Call:
glm(formula = D ~ I, family = binomial(logit), data = gorilla)
Deviance Residuals:
    Min       1Q   Median       3Q      Max
-1.7067  -1.0651   0.7285   1.0137   1.4458
Coefficients:
            Estimate Std. Error z value Pr(>|z|)
(Intercept)  -1.0624     1.2598  -0.843    0.399
I             0.4507     0.3846   1.172    0.241
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 19.121 on 13 degrees of freedom
Residual deviance: 17.621 on 12 degrees of freedom
AIC: 21.621
Number of Fisher Scoring iterations: 4
> glm.out = glm(D ~ ordered(I), family=binomial(logit), data=gorilla)
> summary(glm.out)
Call:
glm(formula = D ~ ordered(I), family = binomial(logit), data = gorilla)
Deviance Residuals:
     Min        1Q    Median        3Q       Max
-1.66511  -0.90052   0.00013   0.75853   1.48230
Coefficients:
              Estimate Std. Error z value Pr(>|z|)
(Intercept)     3.6557   922.4405   0.004    0.997
ordered(I).L    1.3524     1.2179   1.110    0.267
ordered(I).Q   -9.5220  2465.3259  -0.004    0.997
ordered(I).C    0.1282     1.2974   0.099    0.921
ordered(I)^4   13.6943  3307.5816   0.004    0.997
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 19.121 on 13 degrees of freedom
Residual deviance: 14.909 on 9 degrees of freedom
AIC: 24.909
Number of Fisher Scoring iterations: 17
Data used:
I,D
1,0
1,0
1,1
2,0
2,0
2,1
3,1
3,1
4,0
4,1
5,0
5,1
5,1
5,1
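To make the example reproducible, here is a sketch that rebuilds the data frame from the listing above and cross-tabulates I against D. The construction with rep() and the name tab are my own shorthand, not how the data was originally created.

```r
# Rebuild the sample data from the "Data used" listing
gorilla <- data.frame(
  I = rep(1:5, times = c(3, 3, 2, 2, 4)),
  D = c(0L, 0L, 1L, 0L, 0L, 1L, 1L, 1L, 0L, 1L, 0L, 1L, 1L, 1L)
)

# Should match the str() output shown earlier: 14 obs. of 2 integer variables
str(gorilla)

# Counts of each outcome at each level of I
tab <- table(I = gorilla$I, D = gorilla$D)
print(tab)
```

In this data both observations with I = 3 have D = 1, so that row of the table has an empty cell. That may be worth keeping in mind when reading the very large standard errors in the ordered(I) fit.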