2

I am having trouble with the mlogit() function. I am trying to predict which variables in a given set are most preferred among the people who took our survey, and from that the optimal combination of variables for creating the most preferred option. We are measuring "Name", "Logo Size", "Design", "Theme", "Flavor", and "Color".

To do this, we have a large data set and are trying to run it through mlogit.data() and mlogit(), although we keep getting the same error:

Error in if (abs(x - oldx) < ftol) { : missing value where TRUE/FALSE needed

None of my data is negative or missing, so this is very confusing. My syntax is:

#Process data in mlogit.data()

data2 <- 
  mlogit.data(data=data, choice="Choice", 
              shape="long", varying=5:10, 
              alt.levels=paste("pos",1:3))

#Make character columns factors and "choice" column (the one we are 
#measuring) a numeric.

data2$Name <- as.factor(data2$Name)
data2$Logo.Size <- as.factor(data2$Logo.Size)
data2$Design <- as.factor(data2$Design)
data2$Theme <- as.factor(data2$Theme)
data2$Color <- as.factor(data2$Color)
data2$Choice <- as.numeric(as.character(data2$Choice))

##### RUN MODEL ##### 
m1 <- mlogit(Choice ~ 0 + Name + Logo.Size + Design + Theme + Flavor 
+ Color, data = data2)

m1

Does it look like there is a problem with my syntax, or is it likely my data that is the problem?

Dharman
Andrew Colin
  • Including a [minimal reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) in your question will increase your chances of getting an answer. – Samuel Oct 19 '17 at 23:09

7 Answers

2

In a panel setting, one or more of your choice cards may not contain a TRUE value, i.e. no alternative on that card was chosen. One fix is to drop the choice cards that are missing a choice.

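## Use data.table
library(data.table)

## Convert the long-format data frame (here called `data`) to a data.table
dt <- as.data.table(data)

## Drop choice cards that received no choice
## (Choice must be numeric 0/1 or logical for sum() to count selections)
dt[, full := sum(Choice), by = Choice_id]
dt.full <- dt[full != 0]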

This is an issue specific to mlogit(). For example, Stata's mixed logit approach ignores missing response variables, whereas R treats this as an issue that needs to be addressed.

bpar
1

I had the same error. It got resolved when I arranged the data by unique ID and alternative ID. For some reason, mlogit requires all the choice instances to be stacked together.
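A minimal sketch of that reordering in base R, on toy data. The column names `id` and `alt` are hypothetical stand-ins for whatever identifies the choice situation and the alternative in your own data set.

```r
# Toy long-format data with rows in scrambled order.
# `id` and `alt` are hypothetical column names (assumptions for illustration).
df <- data.frame(
  id     = c(2, 1, 2, 1, 1, 2),
  alt    = c("pos 1", "pos 2", "pos 3", "pos 1", "pos 3", "pos 2"),
  Choice = c(0, 1, 1, 0, 0, 0)
)

# Sort so that each choice situation forms one contiguous block of rows,
# with the alternatives in the same order inside every block.
df_sorted <- df[order(df$id, df$alt), ]
rownames(df_sorted) <- NULL

df_sorted
```

The sorted data frame would then be passed to mlogit.data() in place of the unsorted one.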

Gaurav
0

Error in if (abs(x - oldx) < ftol) { : missing value where TRUE/FALSE needed

This error suggests that, if your response variable is binary (i.e. 1/0), one or more of its values is something other than 1 or 0.

Run table(data2$Choice) to see whether this is the case.
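As a rough illustration of that check, here is a sketch on a toy vector (in practice you would use data2$Choice). Note that table() drops NA values by default, so useNA is set to make them visible.

```r
# Toy response vector; stands in for data2$Choice (made-up values).
choice <- c(1, 0, 0, NA, 1, 2, 0)

# useNA = "ifany" makes table() report NA values, which it omits by default.
table(choice, useNA = "ifany")

# Positions of values that are neither 0 nor 1 (NA counts as invalid).
bad <- which(is.na(choice) | !(choice %in% c(0, 1)))
bad
```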

cousin_pete
0

I had a similar issue but eventually figured it out. In my case, it was due to missing values in the covariates, not in the choice response.
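One way to look for this, sketched on a made-up data frame: count the NA values per covariate and flag incomplete rows. The column names mirror the question, but the values are invented.

```r
# Toy covariates with a few missing values (invented data).
covariates <- data.frame(
  Name  = c("A", "B", NA, "A"),
  Color = c("red", "blue", "red", "blue"),
  Theme = c("x", NA, NA, "y")
)

# Number of missing values in each covariate column.
na_per_column <- colSums(is.na(covariates))
na_per_column

# TRUE for rows in which every covariate is present.
complete_rows <- complete.cases(covariates)
complete_rows
```

If incomplete rows are removed, it is probably safer to remove the whole choice situation they belong to, so that the remaining sets still contain every alternative.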

SLi
0

I had this problem when my data included choice situations (questions that participants were asked) in which none of the choices was selected. Removing those rows fixed the problem.

  • Hi, welcome to SO! The OP has stated that: "None of my data is negative or missing". Also please provide justified reasons for deleting data - often this can't be recovered, so perhaps specify the details of how to do this with minimal impact, or some other justification. – Pranav Kasetti Aug 27 '19 at 18:37
0

Just in case others have the same issue: I got this error when running my choice model (maximum difference scaling) on data with partially missing choices, e.g. when the respondent had to make two choices per task/set but made only one.

I solved this in the long-format data set by dropping the observations that belonged to the missing choice while keeping the observations where a valid choice was made.

E.g. assume I have a survey with 9 tasks/sets and in each task/set 5 alternatives are provided. In each task my respondents had to make two choices, i.e. selecting one of the 5 alternatives as "most important" and one of the alternatives as "least important". This results in a data set that has 5*9*2 = 90 rows per respondent. There are exactly 5 rows per task*choice combination (e.g. 5 rows for task 1 containing the alternatives, where exactly one of these 5 rows is coded as 1 in the response variable in case it was chosen as the most (or least) important alternative).

Now imagine a respondent only provides a choice for "most important" but not for "least important". In that case the 5 rows for "least important" would all have a 0 in the response variable. Excluding these 5 rows from the data resolves the above error and, incidentally, leads to the exact same results as other tools would provide (e.g. Sawtooth's Lighthouse software).
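The filtering described above could look roughly like this in base R. This is only a sketch for one respondent and one task; the column names `resp`, `task`, `type` and `alt` are hypothetical and not taken from any real data set.

```r
# One respondent, one task, two choice types with 5 alternatives each.
# Column names are hypothetical; "least" has no selected alternative.
df <- data.frame(
  resp   = rep(1, 10),
  task   = rep(1, 10),
  type   = rep(c("most", "least"), each = 5),
  alt    = rep(1:5, times = 2),
  Choice = c(0, 1, 0, 0, 0,   0, 0, 0, 0, 0)
)

# Number of selected alternatives in each respondent x task x type block.
n_chosen <- ave(df$Choice, df$resp, df$task, df$type, FUN = sum)

# Keep only blocks in which exactly one alternative was selected.
df_valid <- df[n_chosen == 1, ]
df_valid
```

Only the all-zero block is removed, so the valid "most important" choice of the same respondent stays in the data.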

deschen
0

Re (1)

"data2 <-   mlogit.data(data=data, choice="Choice", 
          shape="long", varying=5:10, 
          **alt.levels=paste("pos",1:3))**"

and (2)

"m1 <- mlogit(**Choice** ~ 0 + Name + Logo.Size + Design + Theme + Flavor + Color, data = data2)"

In addition to making sure all of the data is filled in, I would highlight two things: (1) the level names need to exactly match the part of the variable name after the separator, and (2) the dependent variable (DV) in the model needs to be the variable name that appears before the separator.

Example: original variable "Media" with 5 categories -> 5 dummy variables "Med_Radio", "Med_TV", etc: The level names need to be "Radio", "TV", etc., exactly as written. And you must put "Med" into the model, not "Media", as DV.

This fixed the problem for me.
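A small base R sketch of how one might verify both points before reshaping. The column names, the separator and the level vector are assumptions based on the "Med_Radio" example above, with a made-up third level "Web".

```r
# Hypothetical wide-format column names of the form <stem><sep><level>.
wide_cols  <- c("Med_Radio", "Med_TV", "Med_Web")
sep        <- "_"
alt_levels <- c("Radio", "TV", "Web")  # levels you intend to pass on

# Split each column name at the separator into stem and level.
parts        <- strsplit(wide_cols, sep, fixed = TRUE)
stems        <- vapply(parts, `[`, character(1), 1)
levels_found <- vapply(parts, `[`, character(1), 2)

# (1) The level names must match the suffixes exactly (case-sensitive).
names_ok <- setequal(levels_found, alt_levels)

# (2) The stem before the separator is the name to use as DV in the formula.
dv_stem <- unique(stems)

names_ok
dv_stem
```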

cmoez