
I'm currently trying to use scipy.optimize to find the parameters of a simulation that I want to fit to some data. I created a function that returns the chi-square of my model over the data, so that scipy.optimize only has to minimize that function.

One of the major issues is that the simulation, and thus the called function, is very time consuming, and I see that the L-BFGS-B method (or plain BFGS, for that matter) evaluates the function several times at the exact same point! I don't understand why it would do that, and it's killing me.

An example with a very simple function:

from scipy.optimize import minimize

def f3(x):
    print(x)
    return x[0]*x[0] + x[1]*x[1] + x[2]*x[2]

x0 = [3, -5, 7]

minimize(f3, x0, method='L-BFGS-B')

will return :

[ 3. -5.  7.]
[ 3. -5.  7.]
[ 3.00000001 -5.          7.        ]
[ 3.         -4.99999999  7.        ]
[ 3.         -5.          7.00000001]
[ 2.67070726 -4.45117871  6.23165016]
[ 2.67070726 -4.45117871  6.23165016]
[ 2.67070727 -4.45117871  6.23165016]
[ 2.67070726 -4.4511787   6.23165016]
[ 2.67070726 -4.45117871  6.23165017]
[ -1.72315050e-06   1.66152263e-06  -1.59989476e-06]
[ -1.72315050e-06   1.66152263e-06  -1.59989476e-06]
[ -1.71315050e-06   1.66152263e-06  -1.59989476e-06]
[ -1.72315050e-06   1.67152263e-06  -1.59989476e-06]
[ -1.72315050e-06   1.66152263e-06  -1.58989476e-06]
  status: 0
 success: True
    nfev: 15
     fun: 8.2895683293030033e-12
       x: array([ -1.72315050e-06,   1.66152263e-06,  -1.59989476e-06])
 message: 'CONVERGENCE: NORM_OF_PROJECTED_GRADIENT_<=_PGTOL'
     jac: array([ -3.43630101e-06,   3.33304526e-06,  -3.18978951e-06])
     nit: 2

As you can see from the printed values, minimize calls f3 several times at the same x.

This is frustrating because I feel like it is losing a lot of time here.

If someone can enlighten me here, I'd be so glad. Thanks.

Tom

2 Answers


It does that because it's not being as careful as you would like it to be. This deficiency has been added to the scipy bug tracker here. As I posted over there, you can work around it by caching the previous value yourself. Alternatively, you can pass jac=True in your minimize call and write your function to return both the value at the point and the gradient. An example of the first approach is:

import numpy as np
from scipy import optimize

class CacheLast(object):
    def __init__(self, f):
        self.f = f
        self.last_x = None
        self.last_f = None
        self.ncall = 0

    def __call__(self, x):
        # Return the cached value if x is the same point as last time.
        if self.last_x is not None and np.all(x == self.last_x):
            return self.last_f
        else:
            self.last_x = np.copy(x)  # copy, since the optimizer may reuse its array
            self.last_f = self.f(x)
            self.ncall += 1
            return self.last_f

def f3(x):
    return x[0]*x[0] + x[1]*x[1] + x[2]*x[2]

x0 = [3, -5, 7] 

func = CacheLast(f3)
res = optimize.minimize(func, x0, method='L-BFGS-B')

print('total function calls: ', res.nfev)
print('actual function evals: ', func.ncall)

Which gives:

total function calls:  15
actual function evals:  12
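A sketch of the second approach might look like the following (f3_with_grad is just an illustrative name): return the value and the analytic gradient together and pass jac=True, so the optimizer never needs the extra finite-difference evaluations.

import numpy as np
from scipy import optimize

def f3_with_grad(x):
    # Return the objective value and its analytic gradient as a pair.
    value = x[0]*x[0] + x[1]*x[1] + x[2]*x[2]
    grad = 2.0 * np.asarray(x)
    return value, grad

x0 = [3, -5, 7]
res = optimize.minimize(f3_with_grad, x0, jac=True, method='L-BFGS-B')
print('function evals:', res.nfev)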
ewmoore

It's evaluating a finite-difference approximation of the gradient.
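Because no gradient was supplied, L-BFGS-B estimates it with forward differences: one call at x plus one call per coordinate at x shifted by a tiny step, which is exactly the pattern of nearly identical points in your printout. As a rough sketch of what that estimate looks like (using scipy.optimize.approx_fprime with a step of 1e-8, matching the shifts visible above):

import numpy as np
from scipy.optimize import approx_fprime

def f3(x):
    return x[0]*x[0] + x[1]*x[1] + x[2]*x[2]

x0 = np.array([3.0, -5.0, 7.0])

# Forward differences: f3 is evaluated at x0 and at x0 + 1e-8 * e_i for each
# coordinate, so one gradient estimate costs len(x0) + 1 function calls.
print(approx_fprime(x0, f3, 1e-8))  # close to the analytic gradient [6, -10, 14]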

identity-m
  • Thank you for your answer, but I still wonder why it re-computes the exact same point and doesn't use the value it computed the step before! It would be much faster to re-use values already computed, in particular in the case of a time-consuming function... – Tom Oct 15 '14 at 12:42