
What's the recommended package for constrained non-linear optimization in Python?

The specific problem I'm trying to solve is this:

I have an unknown X (Nx1), M u vectors (each Nx1), and M s matrices (each NxN).

max  [5th percentile of (u_i^T * X), i in 1..M]
s.t. 0 <= X <= 1
     [95th percentile of (X^T * s_i * X), i in 1..M] <= constant

When I started out the problem I only had one point estimate for u and s and I was able to solve the problem above with cvxpy.

I realized that instead of one estimate for u and s, I had the entire distribution of values, so I wanted to change my objective function to use the entire distribution. The problem description above is my attempt to include that information in a meaningful way.

cvxpy cannot be used to solve this. I've tried scipy.optimize.anneal, but I can't seem to set bounds on the unknowns. I've looked at pulp too, but it doesn't allow nonlinear constraints.

akhil
  • Questions asking us to recommend or find a tool, library or favorite off-site resource are off-topic for Stack Overflow as they tend to attract opinionated answers and spam. Instead, describe the problem and what has been done so far to solve it. – jonrsharpe Feb 13 '14 at 21:26

4 Answers


While the SLSQP algorithm in scipy.optimize.minimize is good, it has a bunch of limitations. The first is that it's a QP solver, so it works well for problems that fit a quadratic programming paradigm. But what happens if you have functional constraints? Also, scipy.optimize.minimize is not a global optimizer, so you often need to start very close to the final result.

There is a constrained nonlinear optimization package (called mystic) that has been around for nearly as long as scipy.optimize itself -- I'd suggest it as the go-to for handling any general constrained nonlinear optimization.

For example, your problem, if I understand your pseudo-code, looks something like this:

import numpy as np

M = 10
N = 3
Q = 10
C = 10

# let's be lazy, and generate s and u randomly...
s = np.random.randint(-Q,Q, size=(M,N,N))
u = np.random.randint(-Q,Q, size=(M,N))

def percentile(p, x):
    # return the p-th percentile (p in 0..100) of the samples in x
    x = np.sort(x)
    p = 0.01 * p * len(x)          # fractional index of the percentile
    if int(p) != p:                # not on a sample boundary:
        return x[int(np.floor(p))] # take the sample just below
    p = int(p)
    return x[p:p+2].mean()         # on a boundary: average the two neighbors

def objective(x, p=5): # inverted objective, to find the max
    # 5th percentile of u_i^T x over all M samples
    return -1*percentile(p, [np.dot(np.atleast_2d(u[i]), x)[0] for i in range(M)])


def constraint(x, p=95, v=C): # 95%(xTsx) - v <= 0
    # 95th percentile of x^T s_i x over all M samples, shifted so feasible <= 0
    x = np.atleast_2d(x)
    return percentile(p, [np.dot(np.dot(x,s[i]),x.T)[0,0] for i in range(M)]) - v

bounds = [(0,1) for i in range(0,N)]

So, to handle your problem in mystic, you just need to specify the bounds and the constraints.

from mystic.penalty import quadratic_inequality
@quadratic_inequality(constraint, k=1e4)
def penalty(x):
    return 0.0

from mystic.solvers import diffev2
from mystic.monitors import VerboseMonitor
mon = VerboseMonitor(10)

result = diffev2(objective, x0=bounds, penalty=penalty, npop=10, gtol=200, \
                 disp=False, full_output=True, itermon=mon, maxiter=M*N*100)

print(result[0])
print(result[1])

The result looks something like this:

Generation 0 has Chi-Squared: -0.434718
Generation 10 has Chi-Squared: -1.733787
Generation 20 has Chi-Squared: -1.859787
Generation 30 has Chi-Squared: -1.860533
Generation 40 has Chi-Squared: -1.860533
Generation 50 has Chi-Squared: -1.860533
Generation 60 has Chi-Squared: -1.860533
Generation 70 has Chi-Squared: -1.860533
Generation 80 has Chi-Squared: -1.860533
Generation 90 has Chi-Squared: -1.860533
Generation 100 has Chi-Squared: -1.860533
Generation 110 has Chi-Squared: -1.860533
Generation 120 has Chi-Squared: -1.860533
Generation 130 has Chi-Squared: -1.860533
Generation 140 has Chi-Squared: -1.860533
Generation 150 has Chi-Squared: -1.860533
Generation 160 has Chi-Squared: -1.860533
Generation 170 has Chi-Squared: -1.860533
Generation 180 has Chi-Squared: -1.860533
Generation 190 has Chi-Squared: -1.860533
Generation 200 has Chi-Squared: -1.860533
Generation 210 has Chi-Squared: -1.860533
STOP("ChangeOverGeneration with {'tolerance': 0.005, 'generations': 200}")
[-0.17207128  0.73183465 -0.28218955]
-1.86053344078
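
Note that the solution above falls outside the 0 <= x <= 1 bounds: bounds is defined but never passed to the solver, so x0=bounds only seeds the initial population. A minimal fix, mirroring the pressure vessel example below, is to pass the bounds explicitly, then sanity-check the constraint at the solution:

result = diffev2(objective, x0=bounds, bounds=bounds, penalty=penalty,
                 npop=10, gtol=200, disp=False, full_output=True,
                 itermon=mon, maxiter=M*N*100)

# quick check: the 95th-percentile constraint should be satisfied (<= 0)
print(constraint(result[0]) <= 0.0)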

mystic is very flexible, and can handle any type of constraints (e.g. equalities, inequalities), including symbolic and functional constraints. I specified the constraints as "penalties" above, which is the traditional way, in that they apply a penalty to the objective when the constraint is violated. mystic also provides nonlinear kernel transformations, which constrain solution space by mapping candidate solutions onto the space of valid solutions (i.e. by a spatial mapping or kernel transformation).

As an example, here's mystic solving a problem that breaks a lot of QP solvers, since the constraints are not in the form of a constraints matrix. It's optimizing the design of a pressure vessel.

"Pressure Vessel Design"

def objective(x):
    x0,x1,x2,x3 = x
    return 0.6224*x0*x2*x3 + 1.7781*x1*x2**2 + 3.1661*x0**2*x3 + 19.84*x0**2*x2

bounds = [(0,1e6)]*4
# with penalty='penalty' applied, solution is:
xs = [0.72759093, 0.35964857, 37.69901188, 240.0]
ys = 5804.3762083

from mystic.symbolic import generate_constraint, generate_solvers, simplify
from mystic.symbolic import generate_penalty, generate_conditions

equations = """
-x0 + 0.0193*x2 <= 0.0
-x1 + 0.00954*x2 <= 0.0
-pi*x2**2*x3 - (4/3.)*pi*x2**3 + 1296000.0 <= 0.0
x3 - 240.0 <= 0.0
"""
cf = generate_constraint(generate_solvers(simplify(equations)))
pf = generate_penalty(generate_conditions(equations), k=1e12)


if __name__ == '__main__':

    from mystic.solvers import diffev2
    from mystic.math import almostEqual
    from mystic.monitors import VerboseMonitor
    mon = VerboseMonitor(10)

    result = diffev2(objective, x0=bounds, bounds=bounds, constraints=cf, penalty=pf,
                     npop=40, gtol=50, disp=False, full_output=True, itermon=mon)

    assert almostEqual(result[0], xs, rel=1e-2)
    assert almostEqual(result[1], ys, rel=1e-2)

Find this, and roughly 100 examples like it, here: https://github.com/uqfoundation/mystic.

I'm the author, so I am slightly biased. However, the bias is very slight. mystic is both mature and well-supported, and is unparalleled in its capacity to solve hard constrained nonlinear optimization problems.

Mike McKerns
  • More examples with nonlinear constraints here: http://stackoverflow.com/a/42299338/2379433 – Mike McKerns Apr 04 '17 at 14:19
  • Sounds promising, but the Project homepage linked from https://pypi.org/project/mystic/ (http://www.cacr.caltech.edu/~mmckerns) is broken! – feetwet Dec 05 '17 at 19:37
  • @feetwet: That page has been moved to http://trac.mystic.cacr.caltech.edu/project/mystic/wiki.html. The link on pypi will get updated in the upcoming release. – Mike McKerns Dec 10 '17 at 13:53
  • Am i missing something or should it be ** not * on this line ```return -1*percentile(p, [np.dot(np.atleast_2d(u[i]), x)[0] for i in range(0,M-1)])``` – deweydb Jan 04 '18 at 03:27
  • @deweydb: I don't understand what you are asking. Please clarify. – Mike McKerns Jan 10 '18 at 07:07
  • @Mike McKerns : I am trying to do an optimization to match the eigenvalues of a known 12 by 12 matrix with a symbolic matrix (same size) with 10 variables. I have posted a question here https://stackoverflow.com/questions/48873110/least-square-optimization-to-minimize-the-error-between-the-eigenvalues-of-one-k?noredirect=1#comment84750676_48873110. Can i do such optimization using mystic package – Paul Thomas Feb 20 '18 at 18:09
  • @MikeMcKerns Mystic is indeed spectacular! Do you have any guidance on how to pick the `k` parameter in specifying penalties? It seems rather tricky, as common sense suggests you have to match scale with the objective function. Any guidance? – ozataman Jul 05 '18 at 03:45
  • @ozataman: basically what I do to pick `k` is generally to start with a small value, and if it seems like the penalty is not being "respected", then I increase `k`. Yes, it's all about matching scale with the objective. Optimally, you want the penalty to be negligible compared to the objective everywhere except where the constraints are violated... there you want the penalty to dominate the objective. That's the best I can do. – Mike McKerns Jul 05 '18 at 11:41
  • While it's true that SLSQP has limitations, your pressure vessel example shows nicely how important scaling is: `constr[2]` swamps the others -- divide it by 1e4 or so, then scipy SLSQP (80 s fortran) gets to ~ your diffev2. Now, can mystic, can any tool, spot such crummy scaling ? in 4d, in 40d ? (Btw, `x0=bounds` ?) – denis Oct 28 '18 at 17:50
  • @denis: Yes, `mystic` has some tools for auto-dimensional collapse of redundant and irrelevant parameters -- it's built to collapse high-dimensional parameter sets by introspecting the solver history (see: `mystic.termination`). Also, per your comment about `x0=bounds`, in `mystic`, `x0` can either be a single starting point, or a randomly chosen value from a range. – Mike McKerns Oct 29 '18 at 10:58
  • For example, here's collapse that pegs parameters that hover around zero weight in a support vector machine example: https://github.com/uqfoundation/mystic/blob/master/examples3/test_svc2.py – Mike McKerns Oct 29 '18 at 11:05

scipy has a spectacular package for constrained non-linear optimization.

You can get started by reading the optimize doc, but here's an example with SLSQP:

minimize(func, [-1.0, 1.0], args=(-1.0,), jac=func_deriv,
         constraints=cons, method='SLSQP', options={'disp': True})
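
For completeness, here is a self-contained version of that call, adapted from the constrained minimization example in the SciPy tutorial (the objective, gradient, and constraints below are the tutorial's toy problem, not the question's):

import numpy as np
from scipy.optimize import minimize

def func(x, sign=1.0):
    # maximize f(x,y) = 2xy + 2x - x^2 - 2y^2 by minimizing -f (sign=-1)
    return sign*(2*x[0]*x[1] + 2*x[0] - x[0]**2 - 2*x[1]**2)

def func_deriv(x, sign=1.0):
    # analytic gradient, supplied to the solver via jac=
    return np.array([sign*(-2*x[0] + 2*x[1] + 2),
                     sign*(2*x[0] - 4*x[1])])

cons = ({'type': 'eq',   'fun': lambda x: np.array([x[0]**3 - x[1]])},
        {'type': 'ineq', 'fun': lambda x: np.array([x[1] - 1])})

res = minimize(func, [-1.0, 1.0], args=(-1.0,), jac=func_deriv,
               constraints=cons, method='SLSQP', options={'disp': True})
print(res.x)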
Slater Victoroff

As others have commented as well, the SciPy minimize package is a good place to start. We also have a review of many other optimization packages in the Python Gekko paper (see Section 4). I've included an example below (the Hock Schittkowski #71 benchmark) with an objective function, an equality constraint, and an inequality constraint, using scipy.optimize.minimize.

import numpy as np
from scipy.optimize import minimize

def objective(x):
    return x[0]*x[3]*(x[0]+x[1]+x[2])+x[2]

def constraint1(x):
    return x[0]*x[1]*x[2]*x[3]-25.0

def constraint2(x):
    sum_eq = 40.0
    for i in range(4):
        sum_eq = sum_eq - x[i]**2
    return sum_eq

# initial guesses
n = 4
x0 = np.zeros(n)
x0[0] = 1.0
x0[1] = 5.0
x0[2] = 5.0
x0[3] = 1.0

# show initial objective
print('Initial Objective: ' + str(objective(x0)))

# optimize
b = (1.0,5.0)
bnds = (b, b, b, b)
con1 = {'type': 'ineq', 'fun': constraint1} 
con2 = {'type': 'eq', 'fun': constraint2}
cons = ([con1,con2])
solution = minimize(objective,x0,method='SLSQP',\
                    bounds=bnds,constraints=cons)
x = solution.x

# show final objective
print('Final Objective: ' + str(objective(x)))

# print solution
print('Solution')
print('x1 = ' + str(x[0]))
print('x2 = ' + str(x[1]))
print('x3 = ' + str(x[2]))
print('x4 = ' + str(x[3]))

Here is the same problem with Python Gekko:

from gekko import GEKKO
m = GEKKO()
x1,x2,x3,x4 = m.Array(m.Var,4,lb=1,ub=5)
x1.value = 1; x2.value = 5; x3.value = 5; x4.value = 1

m.Equation(x1*x2*x3*x4>=25)
m.Equation(x1**2+x2**2+x3**2+x4**2==40)
m.Minimize(x1*x4*(x1+x2+x3)+x3)

m.solve(disp=False)
print(x1.value,x2.value,x3.value,x4.value)
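
For reference, both versions should converge to the well-known optimum of this benchmark, approximately x = (1.000, 4.743, 3.821, 1.379) with an objective value of about 17.014.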

There is also a more comprehensive discussion thread on nonlinear programming solvers for Python if SLSQP can't solve your problem. My course material on Engineering Design Optimization is available if you need additional information on the solver methods.

John Hedengren
  • Thank you for your answer, here in `def constraint2(x)` can you provide `sum_eq` to the constraint function without hardcoding it like `def constraint2(x, sum_eq)`? I am intending to use a dot product constraint like `np.linalg.dot(x,p) == 0` where `p` is a static vector that I calculate outside the optimisation routine. Thank you. – Zhubarb Dec 05 '17 at 15:04
  • You can do so by using partial functions like this `new_constraint = functools.partial(constraint2, p=p)`; see the sketch after these comments. – Joonatan Samuel Aug 13 '18 at 09:24
  • You can use arrays in Gekko with `x=m.Array(m.Var,(4,3))` and `p=m.Array(m.Var,3)`. You can then include the dot product as `m.Equations(np.linalg.dot(x,p) == 0)`. There are examples here: https://github.com/BYU-PRISM/GEKKO/blob/master/examples/test_arrays.py and https://github.com/BYU-PRISM/GEKKO/blob/master/examples/test_matrix.py – John Hedengren Apr 20 '20 at 16:54
  • @JohnHedengren, hey man, GEKKO seems pretty awesome and has a very intuitive and simple API, which most optimization packages lack. Could you explain what are the main advantages of GEKKO over other packages like SciPy? – GitHunter0 Aug 06 '20 at 04:43
  • There are advantages and disadvantages to both. GEKKO has additional solution modes that go beyond Nonlinear Programming: https://gekko.readthedocs.io/en/latest/overview.html It uses automatic differentiation to provide exact 1st and 2nd derivatives in sparse form to solvers such as IPOPT and APOPT. It can be much faster for large-scale problems. The main disadvantage is that it can't use an external model function call. The equations must be written in Python. – John Hedengren Aug 06 '20 at 13:05
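
Expanding on the partial-function suggestion in the comments, a minimal sketch (the vector `p` and the helper name here are illustrative, not from the original code):

import functools
import numpy as np

def dot_constraint(x, p):
    # enforced as an equality constraint: dot(x, p) == 0
    return np.dot(x, p)

p = np.array([1.0, -2.0, 0.5, 0.0])  # static vector computed outside the routine
con_dot = {'type': 'eq', 'fun': functools.partial(dot_constraint, p=p)}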

Typically for fitting you can use scipy.optimize functions, or lmfit, which simply extends the scipy.optimize package to make it easier to pass things like bounds (see the sketch below). Personally, I like using kmpfit, part of the kapteyn library, which is based on the C implementation of MPFIT.

scipy.optimize.minimize() is probably the easiest to obtain and is commonly used.
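
As a minimal sketch of the bounds point above (assuming lmfit is installed; the residual is a toy example, not from the question):

import numpy as np
from lmfit import minimize, Parameters

def residual(params):
    # toy residual: the best fit is x = 0.3, comfortably inside the bounds
    x = params['x'].value
    return np.array([x - 0.3])

params = Parameters()
params.add('x', value=0.5, min=0.0, max=1.0)  # bounds attached to the parameter
out = minimize(residual, params)
print(out.params['x'].value)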

pseudocubic
  • Thanks pseudocubic. These packages look great. I will try them out now. Just wondering if there is an easy way to do the matrix multiplication? Or do I need to expand out each term? – akhil Feb 13 '14 at 21:58
  • Actually my problem isn't a fitting problem. – akhil Feb 13 '14 at 22:53