
I am following Exercise 3 of the mlogit package vignette (https://cran.r-project.org/web/packages/mlogit/vignettes/e3mxlogit.html), but using my own data (see below):

df <- structure(list(Choice.Set = c(4L, 5L, 7L, 8L, 10L, 12L), Alternative = c(2L, 
1L, 1L, 2L, 2L, 2L), respondent = c(1L, 1L, 1L, 1L, 1L, 1L), 
    code = c(7L, 9L, 13L, 15L, 19L, 23L), Choice = c(1L, 1L, 
    1L, 1L, 1L, 1L), price1 = c(0L, 0L, 1L, 1L, 0L, 0L), price2 = c(0L, 
    1L, 0L, 0L, 1L, 1L), price3 = c(0L, 0L, 0L, 0L, 0L, 0L), 
    price4 = c(1L, 0L, 0L, 0L, 0L, 0L), price5 = c(0L, 0L, 0L, 
    0L, 0L, 0L), zone1 = c(0L, 0L, 0L, 1L, 1L, 1L), zone2 = c(0L, 
    0L, 0L, 0L, 0L, 0L), zone3 = c(1L, 0L, 1L, 0L, 0L, 0L), zone4 = c(0L, 
    1L, 0L, 0L, 0L, 0L), lic1 = c(0L, 0L, 0L, 0L, 0L, 0L), lic2 = c(1L, 
    0L, 1L, 0L, 1L, 1L), lic3 = c(0L, 1L, 0L, 1L, 0L, 0L), enf1 = c(0L, 
    0L, 1L, 0L, 1L, 0L), enf2 = c(0L, 0L, 0L, 1L, 0L, 1L), enf3 = c(1L, 
    1L, 0L, 0L, 0L, 0L), chid = 1:6), row.names = c(4L, 5L, 7L, 
8L, 10L, 12L), class = "data.frame")

I have run into an error when running the code:

dfml <- dfidx(df, idx=list(c("chid", "respondent")), 
              choice="Alternative", varying=6:20, sep ="")

"Error in reshapeLong(data, idvar = idvar, timevar = timevar, varying = varying, : 'varying' arguments must be the same length"

I have checked the data and each column from 6:20 is the same length; however, some respondents chose some of the options more often than others. Can someone point out where I have gone wrong? It's my first attempt at analyzing choice experiment data.

aynber
Chris Bova

1 Answer


The error means that your price attribute has five levels (price1 to price5), whereas the others have fewer: zone has four, and lic and enf have three each. dfidx can't handle that when reshaping, so you need to provide the missing columns, at least as NA columns.

df <- transform(df, zone5=NA, lic4=NA, lic5=NA, enf4=NA, enf5=NA)

library(mlogit)

dfml <- dfidx(df, idx=list(c("chid","respondent")), choice="Alternative", 
              varying=grep('^price|^zone|^lic|^enf', names(df)), sep="")

dfml
# ~~~~~~~
#   first 10 observations out of 30 
# ~~~~~~~
#    Choice.Set Alternative code Choice price zone lic enf idx
# 1           4       FALSE    7      1     0    0   0   0 1:1
# 2           4        TRUE    7      1     0    0   1   0 1:2
# 3           4       FALSE    7      1     0    1   0   1 1:3
# 4           4       FALSE    7      1     1    0  NA  NA 1:4
# 5           4       FALSE    7      1     0   NA  NA  NA 1:5
# 6           5        TRUE    9      1     0    0   0   0 2:1
# 7           5       FALSE    9      1     1    0   0   0 2:2
# 8           5       FALSE    9      1     0    0   1   1 2:3
# 9           5       FALSE    9      1     0    1  NA  NA 2:4
# 10          5       FALSE    9      1     0   NA  NA  NA 2:5
# 
# ~~~ indexes ~~~~
#    chid respondent id2
# 1     1          1   1
# 2     1          1   2
# 3     1          1   3
# 4     1          1   4
# 5     1          1   5
# 6     2          1   1
# 7     2          1   2
# 8     2          1   3
# 9     2          1   4
# 10    2          1   5
# indexes:  1, 1, 2 

I use grep here to identify the varying= columns. Get rid of the habit of lazily specifying variables as numbers; it's dangerous since order might change easily with small changes in the script.
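
To see which columns are missing before reshaping, something like the following might help. It is only a minimal sketch in base R: the `cols` vector is a stand-in I typed out to mirror the column names in the question (with the real data you could use the names matched by the `grep` call above), and `stems`, `n_levels` and `missing_cols` are names I made up for illustration, not anything provided by dfidx.

```r
# Stand-in for the attribute columns in the question's data (assumption:
# every level column is named <attribute><level number>, e.g. "price3")
cols <- c(paste0("price", 1:5), paste0("zone", 1:4),
          paste0("lic", 1:3), paste0("enf", 1:3))

# Strip the trailing digits to get the attribute each column belongs to
stems <- sub("[0-9]+$", "", cols)

# Number of level columns per attribute; unequal counts trigger the error
n_levels <- table(stems)
print(n_levels)

# Build the full set of names each attribute would need, then find the gaps
max_n <- max(n_levels)
wanted <- unlist(lapply(names(n_levels),
                        function(s) paste0(s, seq_len(max_n))))
missing_cols <- setdiff(wanted, cols)
print(missing_cols)
```

The names in `missing_cols` are the ones added as NA columns in the `transform` call above.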

jay.sf
  • That makes sense, since not all attributes have the same number of levels. I get it now. The code works; however, I'm unable to view the output to ensure it's in the format I need to run mlogit. The error I get is: Error in `[.data.frame`(x, start:min(NROW(x), start + len)) : undefined columns selected – Chris Bova Nov 19 '22 at 20:29
  • @ChrisBova Provide the code that produces the error. – jay.sf Nov 19 '22 at 21:02
  • the code that gives an error is `view(dfml)` – Chris Bova Nov 20 '22 at 09:15
  • @ChrisBova No clue why `View` doesn't work with `"dfidx"` objects, since I prefer the console, I never used `View` anyway. You can try `View(as.data.frame(dfml))` but it won't show the indices, since `dfml` has two elements. Better just print the object by typing `dfml` as shown, also `head(dfml, 5)` works to show the first five lines of both elements. – jay.sf Nov 20 '22 at 09:26
  • That does allow me to see it now. Thank you. With regard to the NAs introduced in your first code block, would it make sense to remove them from the dfidx object? It seems to be throwing errors now when I run mlogit. – Chris Bova Nov 20 '22 at 09:33