The general way is to use a custom callback.
callback : callable, optional
Called after each iteration, as callback(xk), where xk is the current parameter vector.
Now using this is as easy as:
from scipy.optimize import minimize, rosen

x0 = [1.3, 0.7, 0.8, 1.9, 1.2]   # example start-point (same as below)
history = []
def callback(x):
    fobj = rosen(x)              # re-evaluates the objective just for logging
    history.append(fobj)

result = minimize(rosen, x0, method='Nelder-Mead', tol=1e-6, callback=callback)
print(history)
But as mentioned in a comment under the first of the two links given in the comments above (good links!), this approach uses extra function evaluations! (The reason for this obviously comes down to a classic software-design question: how much abstraction to offer?)
Depending on your task, function evaluations might be costly (ignore the following if they are not)!
In that case, you can use memoization to cache previously calculated values so they are not recomputed all the time.
For your optimizer at least, 'Nelder-Mead' (one of the gradient-free optimizers), you would need to cache more than one value to be comparable to a no-callback run, because older values are requested again. How many probably depends on the minimizer method chosen (and some internals).
scipy already uses some memoization for gradients when numerical differentiation is used, as that is very costly. But that code only caches the last value, which does not work for your case (Nelder-Mead behaves differently).
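To see why a depth-1 cache is not enough here, consider a minimal sketch of such a cache (the hypothetical LastValueCache below is illustrative, not scipy's actual implementation); Nelder-Mead frequently re-requests points older than the most recent one, so this cache would mostly miss:

import numpy as np

class LastValueCache(object):
    """ Hypothetical depth-1 cache: only the most recent (x, f(x)) pair is kept. """
    def __init__(self, fun):
        self.fun = fun
        self.x = None            # last evaluated point
        self.fobj = None         # its function value

    def __call__(self, x, *args):
        x = np.asarray(x)
        if self.x is not None and np.array_equal(x, self.x):
            return self.fobj     # hit: same point as the previous call
        self.x = x.copy()        # miss: evaluate and overwrite
        self.fobj = self.fun(x)
        return self.fobj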
So we can build our own memoization cache, for function values only (no gradients, as none are used):
import numpy as np
from scipy.optimize import minimize, rosen

""" Memoization of function-values -> don't recompute """
class Memoize(object):
    """ Modified from https://github.com/scipy/scipy/blob/c96c5294ca73586cadd6f4c10f26b6be5ed35045/scipy/optimize/optimize.py#L53 """
    def __init__(self, fun, n, cache_size=8):
        self.n = n
        self.c_n = cache_size
        self.fun = fun
        self.fobj = np.full(self.c_n, np.inf)
        self.x = np.full((self.c_n, self.n), np.inf)
        self.pos = 0  # circular-like buffer; next to replace = oldest

    def __call__(self, x, *args):
        # Check cache: is any stored row identical to x?
        cands = np.all(x == self.x, axis=1).nonzero()[0]
        if cands.size:
            return self.fobj[cands]
        else:
            fobj = self.fun(x)
            self.fobj[self.pos] = fobj
            self.x[self.pos] = x
            self.pos = (self.pos + 1) % self.c_n
            return fobj
""" rosen-wrapper to check function-evaluations """
nfev = 0
def rosen_wrapper(x):
    global nfev
    nfev += 1
    return rosen(x)
x0 = [1.3, 0.7, 0.8, 1.9, 1.2]
mem_rosen = Memoize(rosen_wrapper, len(x0))
""" Callback storing history """
history = []
def callback(x):
    fobj = mem_rosen(x)   # usually a cache hit -> no extra rosen evaluation
    history.append(fobj)
result = minimize(mem_rosen, x0, method='Nelder-Mead', tol=1e-6, callback=callback)
print(result)
print('iteration fun(x)')
print(history[::50]) # every 50th
print(nfev)
Called with callback and cache-size 8:
final_simplex: (array([[ 1.00000002, 1.00000002, 1.00000007, 1.00000015, 1.00000028],
[ 0.99999999, 0.99999996, 0.99999994, 0.99999986, 0.99999971],
[ 1.00000005, 1.00000007, 1.00000017, 1.00000031, 1.00000063],
[ 1.00000004, 1.00000008, 1.00000013, 1.00000025, 1.00000047],
[ 0.99999999, 0.99999996, 0.99999994, 0.99999984, 0.99999963],
[ 1.00000005, 1.00000004, 1.00000003, 1.00000003, 1.00000004]]), array([ 1.94206402e-13, 2.44964782e-13, 3.10422870e-13,
3.37952410e-13, 5.52173609e-13, 7.16586838e-13]))
fun: 1.9420640199868412e-13
message: 'Optimization terminated successfully.'
nfev: 494
nit: 295
status: 0
success: True
x: array([ 1.00000002, 1.00000002, 1.00000007, 1.00000015, 1.00000028])
iteration fun(x)
[array([ 516.14978061]), array([ 1.16866125]), array([ 0.00135733]), array([ 6.48182410e-05]), array([ 1.03326372e-06]), array([ 7.12094933e-10])]
504
Cache of size 24 (not recommended; just for demo purposes):
nfev: 494
nit: 295
status: 0
success: True
...
494
Now there is obviously a trade-off here, as we store a cache of size:

C_SIZE * N    # x-vectors
C_SIZE * 1    # fun-values

and we compute a linear operation on a C_SIZE * N array in each call.
Whether this pays off, and how to select the cache size, depends on your function, your minimizer and probably also your parameters.
Keep in mind that the chosen approach is based on the idea that a linear amount of numpy-based calculations is probably faster than a log(n)-style (or similar) lookup implemented in pure Python!
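For comparison, a pure-Python alternative keyed on the parameter tuple might look like the hypothetical DictMemoize sketch below (exact float hashing only works because the optimizer re-submits bit-identical vectors); whether it actually beats the vectorized linear scan above is something you would have to measure:

class DictMemoize(object):
    """ Hypothetical pure-python cache: dict keyed by the parameter tuple. """
    def __init__(self, fun, cache_size=8):
        self.fun = fun
        self.c_n = cache_size
        self.cache = {}            # tuple(x) -> f(x)
        self.order = []            # insertion order, used for eviction

    def __call__(self, x, *args):
        key = tuple(x)
        if key in self.cache:      # O(1) average lookup
            return self.cache[key]
        fobj = self.fun(x)
        self.cache[key] = fobj
        self.order.append(key)
        if len(self.order) > self.c_n:
            del self.cache[self.order.pop(0)]   # evict the oldest entry
        return fobj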
(Memoization-code was not checked extensively!)