I'm pretty new to R and still struggle with quite basic stuff. Perhaps someone can help with the following. I want to conduct a logistic regression using a reference category and basically don't know how.

When I started my project I read my dataset in with `convert.factors=FALSE`, so I've got no factors. However, as I understand it, a variable needs to be a factor before `relevel` can be used to set its reference category. Is that right? I converted away from factors to help with cleaning the variables, e.g. I used `ifelse` to remove the -1 and -2 non-responses.

Say I'm doing the following:

votemodel <- glm(vote ~ Constant + partymembership, data=bes, family=binomial())

where `partymembership` has 4 categories, and I would like the coefficient for each party reported separately, relative to a reference category.

Can any clever souls advise how I can do this?
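
A minimal sketch of the setup described above, in case it helps make the question concrete. The data are simulated, and the party labels `"A"` to `"D"`, the new column `party`, and the choice of `"B"` as the reference are assumptions for illustration only; `bes`, `vote` and `partymembership` are the names from the question. The explicit `Constant` term is left out because `glm` adds an intercept by default.

```r
# Simulated stand-in for the survey data; the values are made up.
set.seed(1)
bes <- data.frame(
  vote            = rbinom(200, 1, 0.5),
  partymembership = sample(c(-2, -1, 1, 2, 3, 4), 200, replace = TRUE)
)

# Recode the -1 / -2 non-responses to NA, as an ifelse() cleaning step might.
bes$partymembership <- ifelse(bes$partymembership < 0, NA, bes$partymembership)

# Convert the cleaned numeric codes to a factor (labels are hypothetical).
bes$party <- factor(bes$partymembership,
                    levels = 1:4,
                    labels = c("A", "B", "C", "D"))

# Pick the reference category; relevel() expects a factor.
bes$party <- relevel(bes$party, ref = "B")

# Rows with NA in `party` are dropped by glm()'s default na.action.
votemodel <- glm(vote ~ party, data = bes, family = binomial())
summary(votemodel)
```

With this coding, each reported coefficient is the difference in log-odds between that party and the reference level `"B"`.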

David Arenburg
Henry Cann
  • There isn’t really a reason why you couldn’t use factors here, is there? That said, if you don’t want factors you can convert them to strings with `as.character`. – Konrad Rudolph Feb 22 '15 at 13:41
  • If you want an estimate for each party, you would probably fit an intercept-free model: `glm(vote ~ Constant + partymembership-1, data=bes, family=binomial())`. Otherwise what you're asking for doesn't make sense statistically. – MrFlick Feb 22 '15 at 13:43
  • Thanks both. MrFlick - sorry for being an ignoramus, but theoretically speaking, what does the -1 do? Is this a standard way to remove the intercept? Thanks – Henry Cann Feb 22 '15 at 13:48
  • Also, if I do the following: `new <- glm(cons ~ 1 + partyid-1, data=bes10, family=binomial())` I only get a single printed coefficient for the variable as a whole. How can I break it down into separate coefficients for each party within the variable? – Henry Cann Feb 22 '15 at 13:59
  • This would be easier if you provided a [reproducible example](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example). You can't add a literal constant value in a formula. I'm guessing you would want to specify an offset. And make sure `partyid` is a factor before running the model. If it's numeric, it will be treated as a continuous covariate and just return one slope. – MrFlick Feb 22 '15 at 14:22
  • Ahaa - your final sentence made the key difference. I had converted my numeric party id variable to a factor called 'Party', but forgot to use the new one. Plugging in 'Party' instead of 'partyid' worked and produced 10 different slopes as expected. I also took out the constant. In other news, it doesn't seem to make any difference whether I use 'Party' or 'Party - 1'. Sorry to go back to this, but could you clarify what the -1 is supposed to do here? – Henry Cann Feb 22 '15 at 14:49
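
The two points raised in the comments (numeric versus factor predictors, and the effect of `- 1`) can be compared side by side. This is only a sketch on simulated data: the data frame `d` and the model names are hypothetical, while `cons`, `partyid` and `Party` follow the comments. In an R formula, `- 1` drops the intercept, so a lone factor predictor gets one coefficient per level instead of an intercept plus differences from the reference level.

```r
# Simulated data; the values are made up for illustration.
set.seed(42)
d <- data.frame(
  cons    = rbinom(300, 1, 0.4),
  partyid = sample(1:4, 300, replace = TRUE)   # numeric party codes
)
d$Party <- factor(d$partyid)

# Numeric predictor: treated as continuous, so a single slope is estimated.
m_num <- glm(cons ~ partyid, data = d, family = binomial())

# Factor, default coding: the intercept is the log-odds for the reference
# level "1"; the other coefficients are differences from that level.
m_ref <- glm(cons ~ Party, data = d, family = binomial())

# Factor with "- 1": no intercept, one coefficient per level, each being
# that level's own log-odds rather than a difference from a reference.
m_noint <- glm(cons ~ Party - 1, data = d, family = binomial())

names(coef(m_num))    # "(Intercept)" "partyid"
names(coef(m_ref))    # "(Intercept)" "Party2" "Party3" "Party4"
names(coef(m_noint))  # "Party1" "Party2" "Party3" "Party4"

# Same model, different parameterisation: compare the fitted probabilities.
all.equal(fitted(m_ref), fitted(m_noint), tolerance = 1e-6)
```

The last two models describe the same fit, which may be why dropping the intercept seemed to make no difference; what changes is how the coefficients are labelled and interpreted.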

0 Answers