Implementing Gaussian mixture MLE using optim() in R

Question

I am trying to implement MLE for Gaussian mixtures in R using optim() using R's local datasets (Geyser from MASS). My code is below. The issue is that optim works fine however, returns the original parameters that I pass to it and also says that it has converged. I would appreciate if you can point out where I am going off track. My expectation is that it would at least yield different results if not wildly different.

library(ggplot2)
library(MASS)
data("geyser")
externaldata=geyser$waiting
x.vector=externaldata


MLE.x= function(combined.vector)
{ combined.vector=bigvec
  x.vector = externaldata
  K = k #capital K inside this MLE function, small K defined in the global environment
  prob.vector = combined.vector[1:K] 
  mu.vector =combined.vector[(K+1):((2*K))]
  sd.vector=combined.vector[(2*K+1):(3*K)]
  prob=matrix(rep(prob.vector,length(x.vector)),byrow=TRUE,nrow = length(x.vector))
  mu.sigma=cbind(mu.vector,sd.vector)
  x.by.K=matrix(nrow = length(x.vector), ncol = k)
  for (i in 1:K){
    x.by.K[,i]=dnorm(x.vector,mu.sigma[i,1],mu.sigma[i,2])
  }
  prob.mat=x.by.K*prob
  density=apply(prob.mat,1,sum)
  log.density=sum(-log(density))
  return(log.density)
}



## k=2 set ##
meanvec=c(50,80)
sigmavec=c(5,5)
k=2
probvec=c(1/3,2/3)
bigvec=c(probvec,meanvec,sigmavec)
est.k2.MLE=MLE.x(bigvec)
z=optim(bigvec,
        fn=MLE.x,
        method = "L-BFGS-B")
z


#### k=3 set #####
meanvec=c(50,70,80)
sigmavec=c(5,5,5)
k=3
probvec=rep(1/3,3)
bigvec=c(probvec,meanvec,sigmavec)
est.k3.MLE=MLE.x(bigvec)
z=optim(bigvec,
        fn=MLE.x,
        method = "BFGS")
z

Any reason why you are not using the argument you pass to MLE.x? (First line of MLE.x) — asachet, Aug 19 '15 at 09:19
Also, it is considered bad form to use `=` for assignment, although [the pros and cons are really not that clear](http://stackoverflow.com/questions/1741820/assignment-operators-in-r-and). IMO it's fine to use `=` but I'm just making sure it's a choice on your part. — asachet, Aug 19 '15 at 09:23
Yes I would recommend to use `<-`, alas my only argument is: that's what everybody does and what you will see absolutely everywhere. Also, note that using "return" is optional as the last value computed will be returned anyway. But it may make the code clearer. — asachet, Aug 19 '15 at 10:17

asachet · Accepted Answer · 2015-08-19T10:13:59.207

Remove the first line of the MLE.x function.

It will always return the same thing since its argument is replaced by the global variable "bigvec". So the MLE cannot converge, I think you rather hit the maximum iteration. You can check this by accessing z$convergence where z is the value returned by optim. This will be an integer code. 0 means that all is well, 1 indicates that the maximum number of iteration has been reached. Other values are different error codes.

But the code still doesn't run properly as you point out in comment. I couldn't see any mistake so I added the following snippet at the end of MLE.x:

if(any(is.na(density))) {
    browser()
  } else {
    log.density
  }

What it does is, if there are some NA (or NaN), we call browser() which is an insanely convenient debugging tool: it stops the code and brings up the console so that we can explore the environment. Else we return log.density.

Then I ran the code and, behold, when there is a NA in density, instead of failing, it now brings up the console:

You can see:

Browse[1]> head(x.by.K)
     [,1]       [,2]
[1,]  NaN 0.01032407
[2,]  NaN 0.01152576
[3,]  NaN 0.01183521
[4,]  NaN 0.01032407
[5,]  NaN 0.01107446
[6,]  NaN 0.01079706

The first column of x.by.K is NaN... So dnorm returns NaN...

Browse[1]> mu.sigma
     mu.vector sd.vector
[1,]  64.70180 -20.13726
[2,]  61.89559  33.34679

Here is the problem: SD of -20, that can't be good...

Browse[1]> combined.vector
[1] 1267.90677 1663.42604   64.70180   61.89559  -20.13726   33.34679

But this is the input of MLE.x.

There, I've just showed you how I debug my code :)

So what's happening is that during the optimization routine, parameters 5 and 6 take negative values, which causes dnorm to fail. Why wouldn't they be negative? Optim doesn't know that these are supposed to stay positive!

So you must find a way to do a constrained optimization whith the constraint that SD>0.

But you actually shouldn't do that and rather think about what you want to do because I am not quite sure why you want to fit univariate Gaussian.

Ran the code with the changes you suggested. It is now coming up with: `Warning messages: 1: In dnorm(x.vector, mu.sigma[i, 1], mu.sigma[i, 2]) : NaNs produced z $par [1] 16146.894787 10919.923359 81.029617 54.062756 6.818465 5.615605 $value [1] -1888.043 $counts function gradient 130 100 $convergence [1] 1 $message NULL` — user2007598, Aug 19 '15 at 09:33
the probabilities are over 1 and the MLE is now negative. Any clues? — user2007598, Aug 19 '15 at 09:34

Implementing Gaussian mixture MLE using optim() in R

1 Answers1