
Let's assume we have a linear combination of two normal distributions. I think one would call the result a multimodal distribution.

import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import norm

ls = np.linspace(0, 60, 1000)

# sum of two normal pdfs: N(mu=0, sigma=5) and N(mu=20, sigma=10)
distribution = norm.pdf(ls, 0, 5) + norm.pdf(ls, 20, 10)
distribution = (distribution * 1000).astype(int)
distribution = distribution / distribution.sum()  # normalize so the values sum to 1

plt.plot(ls, distribution)
plt.show()

[Plot of the resulting bimodal distribution]

As you can see, this is a linear combination of two normal distributions with parameters (mu1 = 0, s1 = 5) and (mu2 = 20, s2 = 10). Of course, we usually do not know these parameters beforehand.

I would like to know how I can estimate or fit those parameters (the mus and sigmas). I am confident there are methods for doing this, but I couldn't find any yet.

– Stefan Falk

2 Answers


The problem that you describe is a special case of a Gaussian mixture model. In order to estimate the parameters, you need some samples. If you don't have samples but are given the curve, you can produce samples based on the curve. Then you can use the expectation-maximization (EM) algorithm to estimate the parameters. Scikit-learn has a class that lets you do this: sklearn.mixture.GaussianMixture. You just need to provide your samples, the number of components (n_components), which is 2 in your case, and a covariance type, which would be full in your case, since you have no prior assumptions about the covariance matrix.
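For illustration, here is a minimal sketch of that workflow (not taken from the answer; variable names and sample sizes are my own): rebuild the curve from the question, draw samples by treating the grid points as a discrete distribution weighted by the curve, and fit a two-component GaussianMixture.

# A rough, illustrative sketch; the sampling step and sample size are assumptions.
import numpy as np
from scipy.stats import norm
from sklearn.mixture import GaussianMixture

# Rebuild the curve from the question and normalize it so it can serve
# as sampling weights over the grid points.
ls = np.linspace(0, 60, 1000)
curve = norm.pdf(ls, 0, 5) + norm.pdf(ls, 20, 10)
weights = curve / curve.sum()

# "Produce some samples based on the curve": draw grid points with
# probability proportional to the curve height.
rng = np.random.default_rng(0)
samples = rng.choice(ls, size=5000, p=weights)

# scikit-learn expects a 2-D array of shape (n_samples, n_features).
X = samples.reshape(-1, 1)

gmm = GaussianMixture(n_components=2, covariance_type='full', random_state=0)
gmm.fit(X)

print(gmm.means_.ravel())                 # estimated mus
print(np.sqrt(gmm.covariances_).ravel())  # estimated sigmas
print(gmm.weights_)                       # estimated mixture weights

Note that the grid in the question starts at 0, so the component centered at 0 is truncated and its estimated mean and sigma will be somewhat biased.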

– Miriam Farber
  • Ah! I knew that I should already know that! I was looking for maximum likelihood methods but somehow I failed to look that up! Thanks, this should work :) – Stefan Falk Jul 05 '17 at 20:43
  • Hey :) I am having a follow up question to my original one. Maybe you [want to take a look](https://stats.stackexchange.com/questions/289490/how-can-i-model-such-a-distribution-consisting-of-a-mix-of-different-distributio) at it? – Stefan Falk Jul 08 '17 at 11:31

You might want to use the Expectation Maximization algorithm.

It is an iterative approach for fitting a mixture model. There is a very convenient implementation in scikit-learn: GaussianMixture.

I found it hard to figure out how to structure the data for this algorithm to work, so I set up a sample for you: https://nbviewer.jupyter.org/gist/lhk/e566e2d6b67992eca062f9d96e2a14a2
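
In case the notebook is unavailable, here is a rough sketch of the data layout GaussianMixture expects (the data below is synthetic): each observation is a row, so one-dimensional samples have to be reshaped into a single column.

# Rough sketch of the expected input layout; the data here is made up.
import numpy as np
from sklearn.mixture import GaussianMixture

# Pretend these are observed values drawn from the two components.
rng = np.random.default_rng(1)
data = np.concatenate([rng.normal(0, 5, 2000), rng.normal(20, 10, 3000)])

# fit() wants shape (n_samples, n_features), so a 1-D array of scalar
# observations becomes a single-column matrix.
X = data.reshape(-1, 1)

gmm = GaussianMixture(n_components=2).fit(X)
print(gmm.means_.ravel())  # roughly [0, 20] (component order is arbitrary)
print(gmm.weights_)        # roughly [0.4, 0.6]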

– lhk
  • Is there a chance you can [help out here](https://stats.stackexchange.com/questions/289490/how-can-i-model-such-a-distribution-consisting-of-a-mix-of-different-distributio) as well? The question is a follow up from this one. I didn't consider "border" cases. – Stefan Falk Jul 08 '17 at 11:32