
I am new to using R. I am trying to estimate a latent class logit model using panel data. I tried following this example: https://rpubs.com/msarrias1986/335556.
I was told that the following code should work:

df01 <- mlogit.data(data, 
                      id = "ID", 
                      choice = "Choice", 
                      varying = 3:17, 
                      shape = "wide", 
                      sep = "")

lc <- gmnl(Choice ~ COST + REN + NUCL + OUTAGE180 + OUTAGE360 | 0 | 0 | 0 | 1 , 
           data = df01,
           model = 'lc', 
           Q = 3, 
           panel = TRUE,
           method = "bhhh")

With a basic data file of 17 columns (see image), it works. However, when I add one more column, for example a dummy variable for gender, I get two errors:

  1. In the first command, I get the error "Error in reshapeLong(data, idvar = idvar, timevar = timevar, varying = varying, : 'varying' arguments must be the same length". I noticed that I can get rid of the error by stating 'varying = list(3:18)' instead of 'varying = 3:18', but I'm not sure whether this is the correct way to deal with it.

  2. In the second command, I get the error "Error in eval(predvars, data, env) : object 'COST' not found". 'COST' is indeed not a variable, but 'COST_1' (i.e. the cost of the first alternative), 'COST_2' and 'COST_3' are. I want the coefficient for 'COST' to represent the importance of cost in choosing an alternative. The same applies to all other variables.

I find it curious that just adding 1 column to the datafile causes these errors. I hope someone has some good advice. Thanks for helping!

(example of my data in the included image).


  • Welcome to Stack Overflow! Help us help you: Provide a [mcve]. You've already done quite a bit to provide one, but one last thing would be really helpful: [edit] your question to include the output of `dput(data)` so that we can just copy and paste your data rather than providing your data as an image we can't do anything with. See [How to make a great R reproducible example](https://stackoverflow.com/q/5963269/8386140) for more details. – duckmayr May 06 '20 at 22:52

1 Answer


I kept the argument 'varying = 3:17', added 'alt.levels', and changed the code to:

df01 <- mlogit.data(data, 
                      id = "ID", 
                      choice = "Choice",
                      varying = 3:17, 
                      shape = "wide", 
                      sep = "",
                      alt.levels = c("FOSS","REN","NUCL","COST","OUTAGE"))

lc <- gmnl(Choice ~ COST + REN + NUCL + OUTAGE | MALE | 0 | 0 | 1 , 
           data = df01,
           model = 'lc', 
           Q = 3, 
           panel = TRUE,
           method = "bhhh")

For fewer than 13 individual variables, this seems to work.
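
To see why keeping the individual-specific column out of 'varying' matters, here is a minimal sketch using base R's `reshape()`. The `reshapeLong` in the error message is an internal helper of that function, so I am assuming `mlogit.data()` hands wide data over to it. The toy data frame, its column names (`COST1`, `REN1`, ...) and its values are made up for illustration and are not the data from the question.

```r
# Toy wide data: two alternatives, two alternative-specific attributes,
# and one individual-specific dummy (MALE). Names and values are hypothetical.
wide <- data.frame(
  ID     = c(1, 2),
  Choice = c(1, 2),
  COST1  = c(10, 20), COST2 = c(15, 25),
  REN1   = c(0, 1),   REN2  = c(1, 0),
  MALE   = c(1, 0)
)

# Case 1: only the alternative-specific columns (3 to 6) are listed as varying.
# reshape() splits "COST1" into variable "COST" and alternative 1, and so on;
# MALE is simply repeated on every row of the same individual.
long_ok <- reshape(wide, direction = "long", varying = 3:6, sep = "",
                   idvar = "ID", timevar = "alt")
print(long_ok)

# Case 2: MALE (column 7) is also listed as varying. It has no alternative
# suffix, so the columns cannot be grouped evenly and reshape() stops.
bad <- try(reshape(wide, direction = "long", varying = 3:7, sep = "",
                   idvar = "ID", timevar = "alt"),
           silent = TRUE)
print(inherits(bad, "try-error"))

# Case 3: wrapping the indices in list() treats all five columns as repeated
# measurements of ONE variable, so no separate COST or REN column is created.
stacked <- reshape(wide, direction = "long", varying = list(3:7),
                   v.names = "value", idvar = "ID", timevar = "alt")
print(names(stacked))
```

If the same logic applies inside `mlogit.data()`, then 'varying = list(3:18)' only hides the first error: the reshaped data no longer contains a 'COST' column, which would explain the second error in the question.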