
I'm trying to run a binomial GLM (in R, but I'm open to testing other software). My DV is y/n, and my IVs include things such as gender, age group (neonate, subadult, etc.), length, weight, and a couple of others.

I want to know whether I need dummy variables and, if so, how to convert factors with more than two levels into dummy variables.

Joris Meys
jordan
    R should take care of making dummy variables for factor variables so just make sure your variables are formatted as factors. It would be easier to be more specific with a proper [reproducible example](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) – MrFlick Mar 14 '17 at 21:50

1 Answer


You are talking about a design matrix: a matrix with one row per observation and one column per model term (the intercept, your numeric predictors, and the dummy variables generated from your factors). Each column gets one coefficient in the fitted model.

R will automatically create a design matrix for you internally with model.matrix, so you don't have to do anything. Just make sure you specify the correct variables. Your categorical variables should be stored as factors.
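To see what that expansion looks like, here is a minimal sketch on made-up data. The data frame `d` and its column names are hypothetical, chosen only to resemble the variables in the question; they are not the asker's data.

```r
# Hypothetical toy data; names and values are made up for illustration
set.seed(1)
d <- data.frame(
  y      = rbinom(60, 1, 0.5),
  gender = factor(rep(c("f", "m"), 30)),
  age    = factor(rep(c("neonate", "subadult", "adult"), each = 20)),
  length = runif(60, 10, 50)
)

# The kind of design matrix glm() builds internally: with the default
# treatment contrasts, a factor with k levels becomes k - 1 dummy columns
X <- model.matrix(y ~ gender + age + length, data = d)
colnames(X)
head(X)

# Fitting with the formula interface needs no manual dummy coding
fit <- glm(y ~ gender + age + length, family = binomial, data = d)
names(coef(fit))
```

With the default contrasts, the first level of each factor (alphabetically, unless you set the levels yourself) is the reference level and gets no column of its own, so the three-level `age` factor contributes two dummy columns. The coefficient names from `glm` match the column names of `X`.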

If you want to be convinced, type glm in R for the source code. You will see this:

X <- if (!is.empty.model(mt)) 
    model.matrix(mt, mf, contrasts)

You can also create your own design matrix and pass it to the fitting function glm.fit directly instead of using a formula. ?glm gives you:

glm.fit(x, y ...

Follow the documentation.
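As a rough illustration of the manual route, the sketch below builds a design matrix by hand and passes it to `glm.fit`, then compares the result with the formula interface. The data are made up and the variable names are hypothetical; note that `glm.fit` expects a family object such as `binomial()`.

```r
# Hypothetical toy data; names and values are made up for illustration
set.seed(1)
d <- data.frame(
  y      = rbinom(60, 1, 0.5),
  age    = factor(rep(c("neonate", "subadult", "adult"), each = 20)),
  length = runif(60, 10, 50)
)

# Build the design matrix yourself (intercept + dummies + numeric column)
X <- model.matrix(~ age + length, data = d)

# Fit from the matrix directly
fit_manual <- glm.fit(x = X, y = d$y, family = binomial())

# Same model through the usual formula interface
fit_auto <- glm(y ~ age + length, family = binomial, data = d)

# The two sets of coefficients should agree
all.equal(fit_manual$coefficients, coef(fit_auto))
```

In practice the formula interface is usually more convenient, since `glm.fit` returns a plain list rather than a full `glm` object, so methods like `summary` and `predict` are not directly available for it.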

ABCD
    But note that categorical variables should either be stored as or converted to factor before doing this – Dason Mar 15 '17 at 00:07