Logistic regression - cbind command in glm

Question

I am doing logistic regression in R. Can somebody clarify the difference between running these two lines?

1. glm(Response ~ Temperature, data=temp, 
                    family = binomial(link="logit"))
2. glm(cbind(Response, n - Response) ~ Temperature, 
                    data=temp, family =binomial, Ntrials=n)

The data look like this (note: Response is binary; 0 = die, 1 = not die):

Response  Temperature
0         24.61
1         39.61
1         39.50
0         22.71
0         21.61
1         39.70
1         36.73
1         33.32
0         21.73
1         49.61

Paul... the first line is straightforward to understand. :) I tried to figure out the second one because some examples in R use it. And those two generate different results. :) — Eddie, Feb 02 '12 at 12:24
@James is right, I believe. If `n` is 1 then you should get exactly the same answer in this case. In general you should use the second form when you have more than one trial per observation. The `Ntrials` argument is bogus/unnecessary, as far as I can tell. — Ben Bolker, Feb 02 '12 at 13:12
Thank you very much, Ben. Could you elaborate further on what you mean by "more than one trial per observation", please? :) — Eddie, Feb 02 '12 at 15:39
Suppose your data are grouped so that you had measured multiple individuals (e.g. 10) at each temperature value; you then might have 7 out of 10 surviving at temp 22.71, so your estimation would be based on a binomial outcome of 7 surviving with probability p in N=10 trials. Usually when people say "logistic regression" they mean ungrouped data (`N=1`), reserving "binomial regression" for the grouped case, but the terms are somewhat interchangeable ... — Ben Bolker, Feb 02 '12 at 19:30
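To make the grouped versus ungrouped distinction concrete, here is a minimal sketch. The data frame `grouped`, its columns `survived` and `n`, and the counts in it are all made up for illustration (they are not the poster's data). It fits the grouped form with `cbind()`, expands the same data to one 0/1 row per individual, and fits that too.

```r
# Hypothetical grouped data: several individuals measured at each temperature
grouped <- data.frame(
  Temperature = c(20, 25, 30, 35, 40),
  survived    = c(1, 3, 5, 8, 9),   # number surviving at each temperature
  n           = rep(10, 5)          # number of individuals (trials) per row
)

# Grouped form: two-column matrix of successes and failures
fit_grouped <- glm(cbind(survived, n - survived) ~ Temperature,
                   data = grouped, family = binomial)

# Expand to one row per individual, i.e. an ungrouped 0/1 response
ungrouped <- data.frame(
  Temperature = rep(grouped$Temperature, times = grouped$n),
  Response    = unlist(Map(function(s, n) rep(c(1, 0), times = c(s, n - s)),
                           grouped$survived, grouped$n))
)

# Ungrouped form: plain 0/1 response, one trial per row
fit_ungrouped <- glm(Response ~ Temperature, data = ungrouped, family = binomial)

# The coefficient estimates should agree up to numerical tolerance
coef(fit_grouped)
coef(fit_ungrouped)
```

The two fits should give the same coefficients, although the residual deviance and residual degrees of freedom differ because the number of rows differs.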

score 20 · Accepted Answer · answered Feb 02 '12 at 11:57

When fitting a binomial or quasibinomial glm, the left-hand side of the formula can take one of three forms: a proportion of successes (a numeric vector with values between 0 and 1), a two-column matrix whose columns give the numbers of successes and failures, or a factor whose first level denotes failure and all other levels success. Your first line uses a 0/1 response; your second uses the two-column matrix form. See the details in ?glm.
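As a rough illustration of these response forms with one trial per row, the sketch below fits the same model three ways. The data frame `d` is invented for the example (the poster's ten rows are perfectly separated by temperature, so they would not give a well-behaved fit), and the `Outcome` column and its labels are assumptions.

```r
# Made-up ungrouped data (not the poster's), chosen to avoid perfect separation
d <- data.frame(
  Temperature = c(21, 23, 25, 27, 29, 31, 33, 35, 37, 39),
  Response    = c(0, 0, 1, 0, 0, 1, 0, 1, 1, 1)
)

# 1. Numeric 0/1 response
fit_num <- glm(Response ~ Temperature, data = d, family = binomial)

# 2. Factor response: the first level ("die") is failure, the other is success
d$Outcome <- factor(d$Response, levels = c(0, 1), labels = c("die", "survive"))
fit_fac <- glm(Outcome ~ Temperature, data = d, family = binomial)

# 3. Two-column matrix of successes and failures, with n = 1 for every row
fit_mat <- glm(cbind(Response, 1 - Response) ~ Temperature,
               data = d, family = binomial)

# Compare the coefficient estimates across the three forms
all.equal(coef(fit_num), coef(fit_fac))
all.equal(coef(fit_num), coef(fit_mat))
```

With one trial per observation, all three should describe the same model, which matches the point made in the comments that the `cbind()` form only matters once there is more than one trial per row.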

James

Note that when using the frequency (proportion) form of a binomial glm, you should supply the number of trials per observation in the `weights` argument. It would look like: `glm(events/n ~ x, data=*, weights=n, ...)` – Hong Ooi Feb 02 '12 at 15:16
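A small sketch of the proportion-plus-weights form next to the `cbind()` form, assuming a hypothetical data frame `grouped` with columns `x`, `events` and `n` (the names follow the comment; the numbers are made up):

```r
# Hypothetical grouped data with a different number of trials per row
grouped <- data.frame(
  x      = c(20, 25, 30, 35, 40),
  events = c(1, 3, 4, 8, 13),    # successes in each row
  n      = c(10, 12, 8, 10, 15)  # trials in each row
)

# Proportion of successes as the response, trials supplied as weights
fit_prop <- glm(events / n ~ x, data = grouped, weights = n, family = binomial)

# Equivalent two-column matrix form: successes and failures
fit_cbind <- glm(cbind(events, n - events) ~ x, data = grouped, family = binomial)

# Both the coefficients and the residual deviance should match
all.equal(coef(fit_prop), coef(fit_cbind))
all.equal(deviance(fit_prop), deviance(fit_cbind))
```

Leaving out `weights = n` in the proportion form would treat every row as a single trial, and R would typically warn about non-integer numbers of successes.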
