
I am working with a large data set and want to run a logit regression on monthly data. For this I create a DataFrame and use the GLM package in Julia. My code looks something like this:

f = glm(Y ~ Age + Duration + Gender + Nationality + MonthIn, Data2000, Binomial(), LogitLink())

Since the data are monthly, I want to create dummy variables for the 12 months, or 11 if I include a constant. MonthIn is just a column holding the month number (e.g. 3 for March). I do not want to regress on it directly; I only included it here to make the explanation easier.

When I looked into how this is done, I only found that R has this built into some of its regression methods, so that monthly dummies are created automatically. As far as I can tell, that is not the case in Julia. My guess would be to use the data-pooling function built into DataFrames.jl to create an indicator matrix, but I am not sure how this, or something similar, would be done. Alternatively, how would I create the dummies by hand?
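For the by-hand route, a minimal sketch in plain Julia might look like the following. The vector `month_in` is a hypothetical stand-in for the MonthIn column, and month 1 is assumed to be the omitted reference level, so that the 11 remaining indicators can sit alongside a constant.

```julia
# Hypothetical month numbers (1..12), standing in for the MonthIn column.
month_in = [1, 3, 3, 12, 7, 1]

# Months 2..12 get an indicator each; month 1 is left out as the reference
# level (an assumption here), which avoids collinearity with the constant.
levels = 2:12

# Rows follow the observations, columns follow the months in `levels`.
dummies = [Int(m == l) for m in month_in, l in levels]

size(dummies)  # (6, 11)
```

Each column could then be added to the DataFrame under its own name and listed in the formula in place of MonthIn.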

I would highly appreciate any help, and please feel free to ask if my question is not clear.

Cheers

PS: From the question "Dummy Variables in Julia" I know that I have to create a Pooled Data Array, but I am not sure how it is done.

DoubleBass
    OK, I figured it out in the end. All I had to do was pool just the column with the month numbers, using pool!(Data2000, [:MonthIn]). After that it worked with the glm logit function, which also dropped the first month automatically to avoid collinearity. – DoubleBass May 03 '17 at 11:46
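
In later package releases, pooled arrays were superseded by categorical arrays, so the same idea can be sketched roughly as below. This assumes recent versions of DataFrames.jl, CategoricalArrays.jl and GLM.jl; the data frame `df` and its values are made up purely for illustration and are not the data from the question.

```julia
using DataFrames, CategoricalArrays, GLM

# Hypothetical data, invented for illustration only.
n = 600
df = DataFrame(
    Y       = rand(0:1, n),          # binary outcome
    Age     = rand(20:60, n),
    MonthIn = repeat(1:12, n ÷ 12),  # month numbers 1..12
)

# Mark the month column as categorical so the formula expands it into
# indicator columns instead of treating it as one numeric regressor.
df.MonthIn = categorical(df.MonthIn)

# The default dummy coding uses the first level (month 1) as the reference.
model = glm(@formula(Y ~ Age + MonthIn), df, Binomial(), LogitLink())

coefnames(model)  # intercept, Age, and one entry per month 2..12
```

The reference month is left out of the coefficient list, which matches the behaviour described in the comment above.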
