I can't find good examples showing the effects of the initialization argument of the Mclust and densityMclust functions. I would like to tune the fitting of the clusters, and I hope the initialization argument is one way to do so. The mclust package manual on CRAN did not make it clear to me how I can tune the fit of a Gaussian mixture.

Can someone please demonstrate the effects of hcPairs, subset and noise with an example and a short explanation of each? If possible, please show them separately so that each effect is easier to understand.

Here is my code:

library(mclust)
set.seed(42)
dat <- c(rnorm(15000,50,2), rnorm(3000,52,1), rnorm(1000,55,2), rnorm(500,60,2), rnorm(50,60,4), rnorm(500,45,2), rnorm(250,40,2), rnorm(50,40,4), rnorm(4000,100,10))

set.seed(42)
mod <- densityMclust(dat, modelNames = "V")
plot(mod, what = "density", data = dat, breaks = 100)

In this example, densityMclust found 5 clusters. Is there a way to use the initialization argument so that only two clusters are found: one with a mean of around 50 containing 20350 data points, and one with a mean of around 100 containing 4000 data points?
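
For reference, here is a minimal sketch of how the fit above can be inspected to see how many components were selected and how many points each one receives. It assumes the same simulated data as in my code and spells the model argument out as modelNames. It does not touch initialization at all; it only reads fields of the fitted object.

```r
library(mclust)

# Same simulated data as in the question (24350 points in total).
set.seed(42)
dat <- c(rnorm(15000, 50, 2), rnorm(3000, 52, 1), rnorm(1000, 55, 2),
         rnorm(500, 60, 2), rnorm(50, 60, 4), rnorm(500, 45, 2),
         rnorm(250, 40, 2), rnorm(50, 40, 4), rnorm(4000, 100, 10))

# Univariate mixture with unequal variances ("V"), default initialization.
mod <- densityMclust(dat, modelNames = "V")

# Number of mixture components chosen by BIC.
n_components <- mod$G
print(n_components)

# Estimated component means and mixing proportions.
print(mod$parameters$mean)
print(mod$parameters$pro)

# How many observations are assigned to each component.
cluster_sizes <- table(mod$classification)
print(cluster_sizes)
```

Comparing the cluster sizes with the group sizes used in the simulation shows which of the simulated groups were merged or split by the fit.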

I know that my request is complex, but I hope someone can provide an example that makes the use of the initialization argument clearer, or simply a clearer explanation than the one in the CRAN manual. If only part of it (the effect of hcPairs, subset or noise alone) can be demonstrated, that would still be a great help. If my example isn't well suited to a demonstration, please feel free to use any other example.

Any help is highly appreciated!

Lisminjul