I'm using sklearn.mixture.GMM to fit two Gaussian curves to an array of data and then overlay the fit on a histogram of the data (the data distribution is a mixture of two Gaussians).

My data is a list of floats, and here is the code I am using:

clf = mixture.GMM(n_components=1, covariance_type='diag')
clf.fit(listOffValues)  

If I set n_components to 1, I get the following error:

    "(or increasing n_init) or check for degenerate data.")
RuntimeError: EM algorithm was never able to compute a valid likelihood given initial parameters. Try different init parameters (or increasing n_init) or check for degenerate data.

And if I set n_components to 2, the error is:

    (self.n_components, X.shape[0]))
ValueError: GMM estimation with 2 components, but got only 1 samples.

For the first error, I tried changing all init parameters of GMM, but it didn't make any difference.

I tried an array of random numbers and the code works perfectly fine, so I can't figure out what the issue could be.

Is there an implementation issue I'm overlooking?

Thank you for your help.

1 Answer

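If I understood you correctly, you would like to fit your data distribution with Gaussians, and you have only one feature per element. scikit-learn expects input of shape (n_samples, n_features), so a flat list appears to be read as a single sample with many features, which is why the second error reports only 1 sample. You should therefore reshape your vector into a column vector: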

listOffValues = np.reshape(listOffValues, (-1, 1))
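To make the reshape concrete, here is a minimal sketch rather than a definitive fix. It rests on a few assumptions: the data is synthetic (a stand-in for listOffValues, drawn from two Gaussians), and it uses sklearn.mixture.GaussianMixture, which replaced GMM after GMM was deprecated and later removed from scikit-learn. On an old release that still ships GMM, the same reshape applies.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Assumption: synthetic stand-in for listOffValues, a flat list of floats
# drawn from a mixture of two Gaussians.
rng = np.random.default_rng(0)
listOffValues = np.concatenate([
    rng.normal(-2.0, 0.5, 300),
    rng.normal(3.0, 1.0, 700),
]).tolist()

# Reshape to (n_samples, 1): one row per observation, one feature per row.
X = np.reshape(listOffValues, (-1, 1))

# GaussianMixture is used here in place of the removed GMM class.
clf = GaussianMixture(n_components=2, covariance_type='diag', random_state=0)
clf.fit(X)

# Sort components by mean so the reported order is stable.
order = np.argsort(clf.means_.ravel())
means = clf.means_.ravel()[order]
stds = np.sqrt(clf.covariances_.ravel()[order])
weights = clf.weights_[order]

# Mixture density on a grid, ready to overlay on a normalized histogram.
grid = np.linspace(X.min(), X.max(), 200).reshape(-1, 1)
density = np.exp(clf.score_samples(grid))
```

To overlay the fit, plot density against grid on top of a histogram drawn with normalized counts (for example matplotlib's hist with density=True), so both are on the same scale.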

Otherwise, if listOffValues describes a curve that you want to fit with several Gaussians, then you should use curve_fit (scipy.optimize.curve_fit). See Gaussian fit for Python.
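For the curve-fitting case, a rough sketch with scipy.optimize.curve_fit might look like the following. The model function two_gaussians, the synthetic x/y data, and the initial guess p0 are all illustrative assumptions, not something taken from the question; with real data the initial guess usually needs to be chosen by looking at the curve.

```python
import numpy as np
from scipy.optimize import curve_fit

def two_gaussians(x, a1, mu1, s1, a2, mu2, s2):
    """Sum of two Gaussian bumps with amplitudes a, centers mu, widths s."""
    return (a1 * np.exp(-0.5 * ((x - mu1) / s1) ** 2)
            + a2 * np.exp(-0.5 * ((x - mu2) / s2) ** 2))

# Assumption: synthetic curve sampled on a grid, with a little noise added.
x = np.linspace(-5.0, 8.0, 200)
true_params = (1.0, -2.0, 0.5, 2.0, 3.0, 1.0)
rng = np.random.default_rng(0)
y = two_gaussians(x, *true_params) + rng.normal(0.0, 0.02, x.size)

# Rough initial guess; curve_fit refines it by nonlinear least squares.
p0 = (1.0, -1.5, 1.0, 1.5, 2.5, 1.5)
popt, pcov = curve_fit(two_gaussians, x, y, p0=p0)

# Fitted curve evaluated on the same grid.
y_fit = two_gaussians(x, *popt)
```

Here popt holds the fitted parameters in the same order as the arguments of two_gaussians, and the square roots of the diagonal of pcov give approximate standard errors.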
