
I am trying to run the `Mclust` function (from the mclust package) on a small data set (106x2). I am running R 3.2.1 on OS X 10.10.3. However, I get the following error:

Error in if (loglik > signif(.Machine$double.xmax, 6) || any(!c(scale,  : 
missing value where TRUE/FALSE needed
Called from: meEVV(data = data, z = z, prior = prior, control = control, warn = warn)

The data set has no missing values. Here it is:

4.2 5
4.2 6
4   5
4   5
4.2 5
4.4 5
3.9 5
4.2 5
3.9 6
4.4 7
4.9 6
4.1 5
4.1 5
4.9 6.5
3.9 5
4.7 5
5.1 5
5.2 6
4.8 6.5
5.2 5
4.5 5
5.1 5
4.2 5
4.4 5
4.1 5
4.4 5
4.2 5
5.1 5
6.1 5
4.2 5
4   5.5
4.2 5
5   5.5
4.2 5
3.9 5
3.9 5
4   5
4.7 5
3.9 5
5.3 5
4.4 5
4.4 5
4.3 5
4.7 5
4.6 6
4.8 5
4   5
4.3 5
3.6 5
4   5
4.1 5
3.8 5
3.9 5
5.2 5
4.7 5
3.9 5
4.8 5
4.9 5
5.7 6.5
5.4 5
5.4 6
4.3 5
3.8 5
4.8 5
4.8 6
3.9 5.5
3.9 5
5.3 5
5.5 7
4.4 5
3.8 5
4.3 7
4   5
4.9 5
4.4 5
4.8 5
3.7 5
3.9 6
4.7 5
3.8 5
4.5 6
3.9 5
4.8 5
5.1 5
5.3 5
4.5 5
5.3 5
4.5 5
5.1 5
3.7 5
5.4 5
4.2 5
4   5
4.6 5
4.6 5
4.7 5
4.3 6
4.3 5
4.3 6.5
4.1 5
4.5 5
4.4 5
3.7 5
3.8 5
3.5 5
4.4 5

Do you know how I can fix it? Thank you.

Paul
  • `dput` is a good way to provide a sample data set for your question. http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example – C8H10N4O2 Jul 01 '15 at 14:44
  • it works if you add a tiny bit of noise, `dat + rnorm(nrow(dat)*2, 0, 0.000001)` – Rorschach Jul 01 '15 at 14:49
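
The noise workaround from the last comment can be sketched as follows. This is only an illustration: it uses a handful of rows copied from the question rather than the full data set, and the object names `dat` and `dat_jit`, the column names, and the seed are assumptions made for the example.

```r
# A few rows copied from the question's data; the full 106 x 2 set would be
# used in practice. The column names x and y are illustrative only.
dat <- cbind(
  x = c(4.2, 4.2, 4.0, 4.4, 4.9, 4.9, 4.0, 5.5),
  y = c(5,   6,   5,   7,   6,   6.5, 5.5, 7)
)

set.seed(42)  # arbitrary seed, only to make the jitter reproducible

# Gaussian noise with sd 1e-6, one draw per cell, as suggested in the comment
noise <- matrix(rnorm(length(dat), mean = 0, sd = 1e-6), nrow = nrow(dat))
dat_jit <- dat + noise

# Size of the largest perturbation, for comparison with the 0.1 step of the data
max(abs(dat_jit - dat))
```

`dat_jit` can then be passed to `Mclust` in place of the original data. The jitter breaks the exact ties, but it does not make the second column any less discrete in substance, so the results should be read with that in mind.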

2 Answers


Variable 2 has only five levels.

This most likely causes numerical problems, due to the lack of variance in some subsets.

Most clustering algorithms really need continuous data. (Steps of 0.5 aren't truly continuous. Multiply the second variable by 2 and you get only the integers 10, 11, 12, 13, 14, which is discrete.)
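
A quick way to check this, sketched in base R on a few rows copied from the question (the data frame name `dat` and the column names are assumptions for illustration; the original data has no column names):

```r
# A few rows copied from the question's data, covering every value that
# appears in the second column.
dat <- data.frame(
  x = c(4.2, 4.2, 4.0, 4.4, 4.9, 4.9, 4.0, 5.5),
  y = c(5,   6,   5,   7,   6,   6.5, 5.5, 7)
)

# Tabulate the distinct values of the second variable
table(dat$y)

# Within a group of points that share the same y value, the variance of y
# is exactly zero, which is the kind of degenerate subset that can break
# a covariance estimate.
var(dat$y[dat$y == 5])
```

On the full data set, most rows have a second value of 5, so a mixture component that captures only those rows would have no spread at all in that dimension.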

But I do not think this really is a clustering problem.

Are you perhaps attempting regression, or some other kind of prediction, by clustering?

Has QUIT--Anony-Mousse

The problem may affect only some of the models that mclust tries, not all of them. I suggest clustering with each model separately and comparing the results of the ones that don't raise this error.

For example, try just EII and VII: `Mclust(datazs[-13], modelNames = c("EII", "VII"))`
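
A minimal sketch of fitting every model separately, assuming the mclust package is installed. The data here is a simulated stand-in with the same shape as the question's (a continuous first column and a five-level second column), not the real data, and the names `dat`, `results` and `ok` are made up for the example. Whether any given model fails on it depends on the data and the mclust version.

```r
library(mclust)

# Simulated stand-in for the question's data (an assumption for illustration);
# replace `dat` with the real 106 x 2 data set.
set.seed(1)
dat <- cbind(
  x = round(rnorm(106, mean = 4.5, sd = 0.5), 1),
  y = sample(c(5, 5.5, 6, 6.5, 7), 106, replace = TRUE,
             prob = c(0.75, 0.05, 0.12, 0.04, 0.04))
)

# Model names known to the installed mclust version
models <- mclust.options("emModelNames")

# Fit each model on its own; keep the error message instead of stopping
results <- lapply(models, function(m) {
  tryCatch(
    Mclust(dat, modelNames = m),
    error = function(e) conditionMessage(e)
  )
})
names(results) <- models

# Models that produced a fitted object rather than an error message or NULL
ok <- names(results)[vapply(results, inherits, logical(1), what = "Mclust")]
ok
```

The fits listed in `ok` can then be compared, for instance by their BIC values, while the remaining entries of `results` hold the error message (or `NULL`) for the models that could not be fitted.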

Naghmeh