R mlogit model, computationally singular

Question

I've spent the whole of today battling to format my data appropriately for mlogit (updated after finding a bug via BondedDust's table(TM) suggestion):

# read.csv() has no 'id' argument, so it is dropped here
raw <- read.csv("C:\\Users\\Andy\\Desktop\\research\\Oxford\\Prefs\\rData.csv", header = TRUE, row.names = NULL)
raw <- na.omit(raw)

library(mlogit)

TM <- mlogit.data(raw, choice = "selected", shape = "long", alt.var = "dishId", chid.var = "individuals", drop.index = TRUE)

Where I fail is when trying to model my data.

model <- mlogit(selected ~ food + plate | sex + age +hand, data = TM)

Error in solve.default(H, g[!fixed]) : system is computationally singular: reciprocal condition number = 6.26659e-18

I would really appreciate some help on the topic. Afraid I'm going a little bananas with it.

The data come from an experiment where we get 1000s of people to decide between pairs of plates of food. We vary how the food looks (either Angular or Circular) and how the plate is shaped (either Angular or Circular).

With best wishes, Andy.

PS Afraid I'm a newbie with statistic Qs on StackOverflow.

Use 'table' to see if you can identify the linear combination that is causing the problem. — IRTFM, Apr 24 '15 at 15:01
I am confused. What is your response here? `dishId` or `selected`? — Randy Lai, Apr 24 '15 at 18:04
@RandyLai, I want to know how the other factors (food, plate and sex [and other factors with less data]) influence Selected. It is more than possible that errors still exist in how I input the data. — andyw, Apr 24 '15 at 18:20
@BondedDust my csv did indeed contain 'cells' that contained no data, and via exploration I found a bug in my CrazyOutputFromTheseDevelopers.csv -> myData program. Afraid getting the same error now though! — andyw, Apr 24 '15 at 18:55
In `solve()`, use a smaller tolerance, like `solve(..., tol = 1e-20)`. This should be fine since you get `reciprocal condition number = 1.71139e-19`. More info in the help file (https://stat.ethz.ch/R-manual/R-devel/library/base/html/solve.html) and this related question (http://stackoverflow.com/questions/22134398/mahalonobis-distance-in-r-error-system-is-computationally-singular). — Konstantinos, Mar 23 '16 at 18:25
@andyw, if you like that answer you got, please 'accept' it with the checkmark option! — DirtStats, Oct 06 '17 at 20:09
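
As a rough illustration of the table() suggestion in the comments above, the sketch below builds a small made-up data frame (not the asker's data) in which plate simply duplicates food. It then uses table() and the rank of the design matrix to expose the dependency. The toy values and the names toy, X and rank_deficient are assumptions for illustration only.

```r
# Toy data (hypothetical): plate is deliberately identical to food.
toy <- data.frame(
  food  = factor(c("Circ", "Ang", "Circ", "Ang", "Circ", "Ang")),
  plate = factor(c("Circ", "Ang", "Circ", "Ang", "Circ", "Ang")),
  sex   = factor(c("male", "male", "female", "female", "male", "female"))
)

# Empty off-diagonal cells show that one variable determines the other.
print(table(toy$food, toy$plate))

# A design matrix whose rank is below its column count is singular,
# which is the kind of problem that makes solve() fail.
X <- model.matrix(~ food + plate + sex, data = toy)
rank_deficient <- qr(X)$rank < ncol(X)
print(rank_deficient)
```

If the rank check flags a problem, cross-tabulating pairs of predictors as above is one way to narrow down which columns are involved.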

score 7 · Accepted Answer · answered Apr 29 '15 at 02:25

mlogit.data cannot interpret your dishId column as the alternative index (alt.var) because different choices use different key pairs. For example, the first choice in your .csv file uses "TS" and "RS" as alternative index keys, but choice 3634 uses "RR" and "RS". You also did not specify the names of the alternatives (alt.levels). When alt.levels is not filled in, mlogit.data tries to detect the alternatives from the alternative index, which it cannot interpret correctly here. This is where everything goes wrong: the 'food' and 'plate' variables are not interpreted as alternatives but are treated as individual-specific variables, which eventually causes the singularity.
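
One way to check for this kind of inconsistency, sketched on a made-up data frame that reuses the question's column names (individuals, dishId, selected). The values are hypothetical and only mimic the "TS"/"RS" versus "RR"/"RS" situation described above.

```r
# Toy long-format data (hypothetical values; column names follow the question).
toy <- data.frame(
  individuals = c(1, 1, 2, 2, 3, 3),
  dishId      = c("TS", "RS", "TS", "RS", "RR", "RS"),
  selected    = c(TRUE, FALSE, FALSE, TRUE, TRUE, FALSE),
  stringsAsFactors = FALSE
)

# Collect the sorted set of alternative keys offered in each choice situation.
keys_per_choice <- tapply(toy$dishId, toy$individuals,
                          function(k) paste(sort(k), collapse = "/"))

# More than one distinct key set means dishId is not a consistent
# alternative index across choice situations.
print(table(keys_per_choice))
inconsistent <- length(unique(keys_per_choice)) > 1
print(inconsistent)
```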

You have two options to fix the issue. You can give the actual alternatives as input to mlogit.data through the alt.levels parameter:

TM <- mlogit.data(raw, choice = "selected", shape = "long", alt.levels = c("food","plate"),chid.var = "individuals",drop.index=TRUE)
model1 <- mlogit(selected ~ food + plate | sex + age +hand, data = TM)

Alternatively, you could opt to make your index keys consistent so that you can give them as input via alt.var. mlogit.data will now be able to correctly guess what your alternatives are:

raw[,3] <- rep(1:2,nrow(raw)/2) # use 1 and 2 as unique alternative keys for all choices
TM <- mlogit.data(raw, choice = "selected", shape = "long", alt.var="dishId", chid.var = "individuals")
model2 <- mlogit(selected ~ food + plate | sex + age +hand, data = TM)
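
Note that the rep(1:2, nrow(raw)/2) line relies on the rows being stored in consecutive pairs. A slightly more defensive variant, sketched here on made-up data, numbers the rows within each choice situation instead. The toy data frame and the altKey column are hypothetical names used only for this illustration.

```r
# Toy long-format data (hypothetical values; column names follow the question).
toy <- data.frame(
  individuals = c(1, 1, 2, 2, 3, 3),
  dishId      = c("TS", "RS", "TS", "RS", "RR", "RS"),
  stringsAsFactors = FALSE
)

# Number the rows within each choice situation: 1, 2, 1, 2, ...
toy$altKey <- ave(seq_len(nrow(toy)), toy$individuals, FUN = seq_along)

# Sanity check: every choice situation has each key exactly once.
stopifnot(all(table(toy$individuals, toy$altKey) == 1))
print(toy)
```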

We verify that both models are indeed identical. The results of model 1:

> summary(model1)

Call:
mlogit(formula = selected ~ food + plate | sex + age + hand, 
    data = TM, method = "nr", print.level = 0)

Frequencies of alternatives:
   food   plate 
0.42847 0.57153 

nr method
4 iterations, 0h:0m:0s 
g'(-H)^-1g = 0.00423 
successive function values within tolerance limits 

Coefficients :
                    Estimate Std. Error t-value  Pr(>|t|)    
plate:(intercept) -0.0969627  0.0764117 -1.2689 0.2044589    
foodCirc           1.0374881  0.0339559 30.5540 < 2.2e-16 ***
plateCirc         -0.0064866  0.0524547 -0.1237 0.9015835    
plate:sexmale     -0.0811157  0.0416113 -1.9494 0.0512512 .  
plate:age16-34     0.1622542  0.0469167  3.4583 0.0005435 ***
plate:age35-54     0.0312484  0.0555634  0.5624 0.5738492    
plate:age55-74     0.0556696  0.0836248  0.6657 0.5055987    
plate:age75+       0.1057646  0.2453797  0.4310 0.6664508    
plate:handright   -0.0177260  0.0539510 -0.3286 0.7424902    
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Log-Likelihood: -8284.6
McFadden R^2:  0.097398 
Likelihood ratio test : chisq = 1787.9 (p.value = < 2.22e-16)

Compare these with the results of model 2. The alternatives are correctly identified, but they appear under their numeric keys (1 and 2) rather than by name:

> summary(model2)

Call:
mlogit(formula = selected ~ food + plate | sex + age + hand, 
    data = TM, method = "nr", print.level = 0)

Frequencies of alternatives:
      1       2 
0.42847 0.57153 

nr method
4 iterations, 0h:0m:0s 
g'(-H)^-1g = 0.00423 
successive function values within tolerance limits 

Coefficients :
                Estimate Std. Error t-value  Pr(>|t|)    
2:(intercept) -0.0969627  0.0764117 -1.2689 0.2044589    
foodCirc       1.0374881  0.0339559 30.5540 < 2.2e-16 ***
plateCirc     -0.0064866  0.0524547 -0.1237 0.9015835    
2:sexmale     -0.0811157  0.0416113 -1.9494 0.0512512 .  
2:age16-34     0.1622542  0.0469167  3.4583 0.0005435 ***
2:age35-54     0.0312484  0.0555634  0.5624 0.5738492    
2:age55-74     0.0556696  0.0836248  0.6657 0.5055987    
2:age75+       0.1057646  0.2453797  0.4310 0.6664508    
2:handright   -0.0177260  0.0539510 -0.3286 0.7424902    
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Log-Likelihood: -8284.6
McFadden R^2:  0.097398 
Likelihood ratio test : chisq = 1787.9 (p.value = < 2.22e-16)

That is awesome! Thanks so much! I have been banging my head against this for over a week. — andyw, Apr 29 '15 at 11:52
@andyw, does this answer actually solve your question? You have four alternatives {RS, TS, RR, TR} corresponding to the four possible combinations of {Circular, Angular} x {Food, Plate}. I'm not sure that the multinomial approach is appropriate here, since people only ever have two of the four alternatives to choose from? In particular, people are not choosing between "food" and "plate" in each trial! I think you might need to try 6 different binomial models for the 6 different choice-pairs your participants were offered {RS/TS, RS/RR, RS/TR, TS/RR, TS/TR, RR/TR}? — logworthy, Sep 14 '15 at 02:17
OR, you could maybe try 2 binomial models that consider each factor separately? i.e. one model to compare Circular food vs Angular food (combining choice sets {RS/TS, RR/TR}), and one model to compare Circular plates vs Angular plates (combining choice sets {RS/RR, TS/TR}). — logworthy, Sep 14 '15 at 02:19
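
A minimal sketch of that second suggestion (one binomial model per factor), assuming the data have been reshaped to one row per trial in which the two dishes differ only in food shape. The data frame trials and the column choseCircFood are hypothetical, the values are made up, and base R's glm() stands in for mlogit here.

```r
# Toy data (hypothetical): 1 = the dish with circular food was chosen.
trials <- data.frame(
  sex = factor(c("male", "male", "male", "male",
                 "female", "female", "female", "female")),
  choseCircFood = c(1, 1, 0, 1, 1, 0, 0, 1)
)

# Binomial (logistic) model of choosing circular food, by sex.
fit <- glm(choseCircFood ~ sex, family = binomial, data = trials)
print(summary(fit)$coefficients)

# The intercept is the log-odds for the reference level ("female");
# plogis() converts it back to a probability.
p_female <- plogis(coef(fit)[["(Intercept)"]])
```

An analogous model for plate shape would use the trials in which only the plate differs.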

score 0 · Answer 2 · answered Apr 26 '15 at 19:59

This is more of a comment than an answer (I don't have enough rep points to comment!). However, I wasn't able to reproduce your code as there isn't any age column in your rData.csv.

utobi

Sorry about that. I had been working on a simpler model just in case the other variables were to blame. I've added age. – andyw Apr 27 '15 at 07:28
Have you seen this post: http://stackoverflow.com/questions/18978572/r-mlogit-on-my-data-giving-error-system-is-computationally-singular?rq=1 ? – utobi Apr 27 '15 at 18:42
