The general way is to use a custom callback.
callback : callable, optional
Called after each iteration, as callback(xk), where xk is the current parameter vector.
Now using this is as easy as:
from scipy.optimize import minimize, rosen

x0 = [1.3, 0.7, 0.8, 1.9, 1.2]   # example start-point (same as below)
history = []
def callback(x):
    fobj = rosen(x)              # re-evaluates the objective just for logging
    history.append(fobj)

result = minimize(rosen, x0, method='Nelder-Mead', tol=1e-6, callback=callback)
print(history)
But as mentioned in a comment under the first of the two links given in the comments above (good links!), this approach uses extra function evaluations! (The reason for this obviously comes down to a classic software-design question: how much abstraction to offer?)
Depending on your task, function evaluations might be costly (ignore the following if they are not)!
In that case, you can use memoization to cache previously calculated values so they are not recomputed all the time.
For your optimizer at least, 'Nelder-Mead' (one of the gradient-free optimizers), you would need to cache more than one value to be comparable to a no-callback run, because older values are requested again. How many probably depends on the minimizer method chosen (and some internals).
scipy already uses some memoization for gradients when numerical differentiation is used, as that is very costly. But that code only caches the last value, which does not work for your case (Nelder-Mead behaves differently).
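To see why a depth-1 cache is not enough here, consider a minimal sketch of such a cache (the hypothetical LastValueCache below is illustrative, not scipy's actual implementation); Nelder-Mead frequently re-requests points older than the most recent one, so this cache would mostly miss:

import numpy as np

class LastValueCache(object):
    """ Hypothetical depth-1 cache: only the most recent (x, f(x)) pair is kept. """
    def __init__(self, fun):
        self.fun = fun
        self.x = None            # last evaluated point
        self.fobj = None         # its function value

    def __call__(self, x, *args):
        x = np.asarray(x)
        if self.x is not None and np.array_equal(x, self.x):
            return self.fobj     # hit: same point as the previous call
        self.x = x.copy()        # miss: evaluate and overwrite
        self.fobj = self.fun(x)
        return self.fobj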
So we can build our own memoization cache, for function values only (no gradients, as none are used):
import numpy as np
from scipy.optimize import minimize, rosen

""" Memoization of function-values -> don't recompute """
class Memoize(object):
    """ Modified from https://github.com/scipy/scipy/blob/c96c5294ca73586cadd6f4c10f26b6be5ed35045/scipy/optimize/optimize.py#L53 """
    def __init__(self, fun, n, cache_size=8):
        self.n = n
        self.c_n = cache_size
        self.fun = fun
        self.fobj = np.full(self.c_n, np.inf)
        self.x = np.full((self.c_n, self.n), np.inf)
        self.pos = 0  # circular-like buffer; next to replace = oldest

    def __call__(self, x, *args):
        # Check cache: is any stored row identical to x?
        cands = np.all(x == self.x, axis=1).nonzero()[0]
        if cands.size:
            return self.fobj[cands]
        else:
            fobj = self.fun(x)
            self.fobj[self.pos] = fobj
            self.x[self.pos] = x
            self.pos = (self.pos + 1) % self.c_n
            return fobj
""" rosen-wrapper to check function-evaluations """
nfev = 0
def rosen_wrapper(x):
    global nfev
    nfev += 1
    return rosen(x)
x0 = [1.3, 0.7, 0.8, 1.9, 1.2]
mem_rosen = Memoize(rosen_wrapper, len(x0))
""" Callback storing history """
history = []
def callback(x):
    fobj = mem_rosen(x)   # usually a cache hit -> no extra rosen evaluation
    history.append(fobj)
result = minimize(mem_rosen, x0, method='Nelder-Mead', tol=1e-6, callback=callback)
print(result)
print('iteration fun(x)')
print(history[::50]) # every 50th
print(nfev)
Called with callback and cache-size 8:
final_simplex: (array([[ 1.00000002, 1.00000002, 1.00000007, 1.00000015, 1.00000028],
[ 0.99999999, 0.99999996, 0.99999994, 0.99999986, 0.99999971],
[ 1.00000005, 1.00000007, 1.00000017, 1.00000031, 1.00000063],
[ 1.00000004, 1.00000008, 1.00000013, 1.00000025, 1.00000047],
[ 0.99999999, 0.99999996, 0.99999994, 0.99999984, 0.99999963],
[ 1.00000005, 1.00000004, 1.00000003, 1.00000003, 1.00000004]]), array([ 1.94206402e-13, 2.44964782e-13, 3.10422870e-13,
3.37952410e-13, 5.52173609e-13, 7.16586838e-13]))
fun: 1.9420640199868412e-13
message: 'Optimization terminated successfully.'
nfev: 494
nit: 295
status: 0
success: True
x: array([ 1.00000002, 1.00000002, 1.00000007, 1.00000015, 1.00000028])
iteration fun(x)
[array([ 516.14978061]), array([ 1.16866125]), array([ 0.00135733]), array([ 6.48182410e-05]), array([ 1.03326372e-06]), array([ 7.12094933e-10])]
504
Cache of size 24 (not recommended; just for demo purposes):
nfev: 494
nit: 295
status: 0
success: True
...
494
Now there is obviously a trade-off here, as we store a cache of size:

C_SIZE * N    # x-vectors
C_SIZE * 1    # fun-values

and we compute a linear operation on a C_SIZE * N array in each call.
Whether this pays off, and how to select the cache size, depends on your function, your minimizer and probably also your parameters.
Keep in mind that the chosen approach is based on the idea that a linear amount of numpy-based calculations is probably faster than a log(n)-style (or similar) lookup implemented in pure Python!
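For comparison, a pure-Python alternative keyed on the parameter tuple might look like the hypothetical DictMemoize sketch below (exact float hashing only works because the optimizer re-submits bit-identical vectors); whether it actually beats the vectorized linear scan above is something you would have to measure:

class DictMemoize(object):
    """ Hypothetical pure-python cache: dict keyed by the parameter tuple. """
    def __init__(self, fun, cache_size=8):
        self.fun = fun
        self.c_n = cache_size
        self.cache = {}            # tuple(x) -> f(x)
        self.order = []            # insertion order, used for eviction

    def __call__(self, x, *args):
        key = tuple(x)
        if key in self.cache:      # O(1) average lookup
            return self.cache[key]
        fobj = self.fun(x)
        self.cache[key] = fobj
        self.order.append(key)
        if len(self.order) > self.c_n:
            del self.cache[self.order.pop(0)]   # evict the oldest entry
        return fobj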
(Memoization-code was not checked extensively!)