5

I have a dataset, and I'd like to find a mixed gaussian model by least square error method.

The code is like this:

from sklearn.neighbors import KernelDensity
kde = KernelDensity().fit(sample)

def gaussian_2d(x,y,meanx,meany,sigx,sigy,rho):
    # rho <= 1
    part1 = 1/(2*np.pi*sigx*sigy*sqrt(1-0.5**2))
    part2 = -1/2*(1-rho**2)
    part3 = (((x-meanx)/sigx)**2-2*rho*(x-meanx)*(y-meany)/(sigx*sigy)+((y-meany)/sigy)**2)
    return part1*exp(part2*part3)

def square_error(f1,f2, u1,v1,sigu1,sigv1,rho1, u2,v2,sigu2,sigv2,rho2, u3,v3,sigu3,sigv3,rho3):
    # 1. Generate Mixed Gaussian Model
    def gaussian1(x,y):
        return gaussian_2d(x,y,u1,v1,sigu1,sigv1,rho1)
    def gaussian2(x,y):
        return gaussian_2d(x,y,u2,v2,sigu2,sigv2,rho2)
    def gaussian3(x,y):
        return gaussian_2d(x,y,u3,v3,sigu3,sigv3,rho3)
    mixed_model = f1*gaussian1(x,y)+f2*gaussian2(x,y)+(1-f1-f2)*gaussian3(x,y)
    # 2. Calculate the sum of square error
    sum_error = 0
    for point in sample:
        error = (exp(mixed_model(point)) - exp(kde.score(point)))**2
        sum_error += error
    return sum_error

# How can I add constraints:
# f1+f2 <= 1
# rho1,2,3 <= 1
result = sp.optimize.minimize(square_error)

But I don't know how to add constrains in minimize method. The example in http://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.minimize.html#scipy.optimize.minimize is very hard to understand.

UPDATE: This is what I end up with,

result = sp.optimize.minimize(
    square_error,
    x0 = [0.2,0.5,
          1,1,1,1,0.3,
          1,1,1,1,0.3,
          1,1,1,1,0.3,],
    bounds = [(0., 1.),(0., 1.),
              (None, None),(None, None),(0., None),(0., None),(0., 1.),
              (None, None),(None, None),(0., None),(0., None),(0., 1.),
              (None, None),(None, None),(0., None),(0., None),(0., 1.),])

But it gives me TypeError: square_error() takes exactly 17 arguments (1 given), what's the problem?

ZK Zhao
  • 19,885
  • 47
  • 132
  • 206
  • possible duplicate of [scipy minimize with constraints](http://stackoverflow.com/questions/20075714/scipy-minimize-with-constraints) – Daniel Lenz Sep 11 '15 at 12:19

1 Answers1

2

You can add only add bounds if the solver supports them, so only for method='L-BFGS-B', TNC and SLSQP.

The bounds are passed via a sequence of (min, max) tuples whose length corresponds to the number of your parameters. An example for fitting with 3 parameters would be:

result = sp.optimize.minimize(
                        square_error,
                        method='L-BFGS-B',
                        bounds=[(0., 5.), (None, 1.e4), (None, None)])

Here, None corresponds to no bound. I'm afraid that constraints on a combination of parameters such as f1+f2 <= 1 in your example is not possible within the framework of bounds in scipy.minimize.

You can, however, simply return np.inf in your cost function if your bounds are violated. I'm not sure about the stability of this, though:

def square_error(f1,f2, other_args):
    if f1+f2 <= 1:
        return np.inf
    # rest of the cost function

Moreover, I'd suggest to use the python implementation of multivariate Gaussians instead of creating them from scratch. That will speed up your fitting, help to avoid bugs and be more readable.

Daniel Lenz
  • 3,334
  • 17
  • 36
  • `I'm afraid that constraints on a combination of parameters such as f1+f2 <= 1 in your example is not possible.` This is not true. There are solvers especially for constrained optimization that support them. – cel Sep 11 '15 at 10:53
  • Could you provide a link? I was referring to the solvers that are included in `scipy.minimize`. – Daniel Lenz Sep 11 '15 at 11:27
  • 1
    Just have a look at the `constraints` keyword. OP is trying to define constraints rather than simple bounds. – cel Sep 11 '15 at 11:57
  • Thanks for pointing that out, I'm afraid I misunderstood the OP. I'll update my answer accordingly. Meanwhile, check out http://stackoverflow.com/questions/20075714/scipy-minimize-with-constraints for a great answer. – Daniel Lenz Sep 11 '15 at 12:17
  • For the `np.random.multivariate_normal`, I need the exact probability density value, and the module seems only provide `sample`? – ZK Zhao Sep 13 '15 at 11:27
  • So, in the end, its a mix of `constraints` and `boundry` settings? – ZK Zhao Sep 13 '15 at 11:29
  • Hi, I've update my code. It gives me `square_error() takes exactly 17 arguments (1 given)`, I don't know what's wrong, can you help me with this? – ZK Zhao Sep 14 '15 at 02:03