I have a data frame with several thousand rows and only a few columns; a portion of the data and my code are pasted below. I am looking at the "Section" preferences of individuals and want to know what might be influencing their choice (e.g., density, area). I successfully converted the data to long format with mlogit.data, but the final step, fitting the model with the mlogit function in R, fails.

Below is my data frame

   > df[1:30,]
    Date      Indiv  Sections Choices  Density    Area
1  8/1/2017     1      A     Yes       0.13       21.7
2  8/1/2017     1      B      No       0.29       12.2
3  8/1/2017     1      C      No       0.23       7.5 
4  8/1/2017     1      D      No       0.05       3.7
5  8/1/2017     1      E      No       0.31       29.3
6  8/1/2017     2      A      No       0.13       21.7 
7  8/1/2017     2      B      No       0.29       12.2 
8  8/1/2017     2      C      Yes      0.23       7.5 
9  8/1/2017     2      D      No       0.05       3.7
10 8/1/2017     2      E      No       0.31       29.3
11 8/1/2017     3      A      No       0.13       21.7
12 8/1/2017     3      B      Yes      0.29       12.2
13 8/1/2017     3      C      No       0.23       7.5
14 8/1/2017     3      D      No       0.05       3.7
15 8/1/2017     3      E      No       0.31       29.3
16 8/2/2017     1      A      No       0.19       21.7
17 8/2/2017     1      B      No       0.27       12.2
18 8/2/2017     1      C      Yes      0.43       7.5
19 8/2/2017     1      D      No       0.11       3.7
20 8/2/2017     1      E      No       0.47       29.3
21 8/2/2017     2      A      No       0.19       21.7
22 8/2/2017     2      B      No       0.27       12.2
23 8/2/2017     2      C      No       0.43       7.5
24 8/2/2017     2      D      No       0.11       3.7
25 8/2/2017     2      E      Yes      0.47       29.3
26 8/2/2017     3      A      No       0.19       21.7
27 8/2/2017     3      B      No       0.27       12.2
28 8/2/2017     3      C      No       0.43       7.5
29 8/2/2017     3      D      No       0.11       3.7
30 8/2/2017     3      E      Yes      0.47       29.3

After I run the mlogit.data function with the raw data I get this:

    Date      Indiv  Sections  Choices   Density     Area
A  8/1/2017     1      A      TRUE        0.13       21.7
B  8/1/2017     1      B      FALSE       0.29       12.2
C  8/1/2017     1      C      FALSE       0.23       7.5 
D  8/1/2017     1      D      FALSE       0.05       3.7
E  8/1/2017     1      E      FALSE       0.31       29.3
A  8/1/2017     2      A      FALSE       0.13       21.7 
B  8/1/2017     2      B      FALSE       0.29       12.2 
C  8/1/2017     2      C      TRUE        0.23       7.5 
D  8/1/2017     2      D      FALSE       0.05       3.7
E  8/1/2017     2      E      FALSE       0.31       29.3
A  8/1/2017     3      A      FALSE       0.13       21.7
B  8/1/2017     3      B      TRUE        0.29       12.2
C  8/1/2017     3      C      FALSE       0.23       7.5
D  8/1/2017     3      D      FALSE       0.05       3.7
E  8/1/2017     3      E      FALSE       0.31       29.3
A  8/2/2017     1      A      FALSE       0.19       21.7
B  8/2/2017     1      B      FALSE       0.27       12.2
C  8/2/2017     1      C      TRUE        0.43       7.5
D  8/2/2017     1      D      FALSE       0.11       3.7
E  8/2/2017     1      E      FALSE       0.47       29.3
A  8/2/2017     2      A      FALSE       0.19       21.7
B  8/2/2017     2      B      FALSE       0.27       12.2
C  8/2/2017     2      C      FALSE       0.43       7.5
D  8/2/2017     2      D      FALSE       0.11       3.7
E  8/2/2017     2      E      TRUE        0.47       29.3
A  8/2/2017     3      A      FALSE       0.19       21.7
B  8/2/2017     3      B      FALSE       0.27       12.2
C  8/2/2017     3      C      FALSE       0.43       7.5
D  8/2/2017     3      D      FALSE       0.11       3.7
E  8/2/2017     3      E      TRUE        0.47       29.3

Below is my mlogit syntax in R:

ML <- mlogit(Choice ~ Density + Area, data = df, method="nr")

Below is the error message:

Error in solve.default(H, g[!fixed]) : 
  Lapack routine dgesv: system is exactly singular: U[5,5] = 0

I've spent several days modifying the code and researching the issue, but I still can't make it run. I would very much like to know what I am doing incorrectly and get some guidance on how to make the mlogit function work with my data.

Thank you very much for your help on this.

crew4u
  • Singularities in regression procedures are always due to data issues, and you have provided very little data. The output of the summary function in the base package and of dput might clarify. The fact that your dates are of character or factor class might be relevant if you have multiple observations per individual. – IRTFM Oct 03 '20 at 01:32

1 Answer

If I understand your data correctly, I ran into this problem too.

It seems to me that Density and Area are both alternative-specific variables that do not vary across individuals (although, in the rows shown, Density does vary by date within an alternative). So I think you need to treat them as alternative-specific variables with generic coefficients. But if you do not have any alternative-specific variables that DO vary across individuals, there is not enough variation in the data to estimate the intercepts as well. So, run your model without intercepts:

ML <- mlogit(Choice ~ Density + Area + 0, data = df, method="nr")

... and hopefully it will work.
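To see why the intercepts get in the way, here is a rough base-R sketch. It does not use mlogit at all. It rebuilds a toy version of the data in the question (the names `toy`, `chid`, and `center_within` are made up for illustration) and checks the rank of the within-choice-situation variation, on the assumption that this is the variation a conditional logit can actually use. Because Area is constant within each section, it is an exact linear combination of the section dummies, so the matrix with intercepts is rank-deficient.

```r
# Toy data mirroring the question: 5 sections, 3 individuals, 2 dates.
sections <- c("A", "B", "C", "D", "E")
area     <- c(21.7, 12.2, 7.5, 3.7, 29.3)   # constant per section
density  <- list("8/1/2017" = c(0.13, 0.29, 0.23, 0.05, 0.31),
                 "8/2/2017" = c(0.19, 0.27, 0.43, 0.11, 0.47))

toy <- expand.grid(Sections = sections, Indiv = 1:3,
                   Date = names(density), stringsAsFactors = FALSE)
toy$Area    <- area[match(toy$Sections, sections)]
toy$Density <- mapply(function(d, s) density[[d]][match(s, sections)],
                      toy$Date, toy$Sections, USE.NAMES = FALSE)
toy$chid    <- paste(toy$Date, toy$Indiv)   # one id per choice situation

# Subtract the choice-situation mean from every column (hypothetical helper).
center_within <- function(X, g) apply(X, 2, function(col) col - ave(col, g))

# With alternative-specific intercepts: dummies for B..E plus Density and Area.
X_full  <- model.matrix(~ Sections + Density + Area, data = toy)[, -1]
# Without intercepts: Density and Area only.
X_noint <- model.matrix(~ 0 + Density + Area, data = toy)

rank_full  <- qr(center_within(X_full,  toy$chid))$rank
rank_noint <- qr(center_within(X_noint, toy$chid))$rank

# rank_full is smaller than ncol(X_full): Area is redundant with the dummies.
cat("with intercepts:   ", rank_full,  "of", ncol(X_full),  "columns\n")
# rank_noint equals ncol(X_noint): both coefficients are identifiable.
cat("without intercepts:", rank_noint, "of", ncol(X_noint), "columns\n")
```

By the same logic, keeping the intercepts and dropping Area instead should also remove the redundancy, but I have not tried that on the real data.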

For clues about the not-enough-terms problem and the relevant vignette quote, see the short discussion under my own question on this: R: Can I analyze non-varying-across-individual alternative-specific attribute variables with mlogit?

JOgawa