
I am playing around with scipy.optimize.root and I am trying to understand what it does and how it works.

My sample code is as follows:

import numpy as np
from scipy.optimize import root

def func(x):
    """ test function x + 2 * np.cos(x) = 0 """
    f = x + 2 * np.cos(x)
    print(x, f)
    return f

def main():
    sol = root(func, 0.3)

if __name__ == "__main__":
    main() 

With the print call in func I get the following output:

[ 0.3] [ 2.21067298]
[ 0.3] [ 2.21067298]
[ 0.3] [ 2.21067298]
[ 0.3] [ 2.21067298]
[-5.10560121] [-4.33928627]
[-1.52444136] [-1.43176461]
[-0.80729233] [ 0.57562174]
[-1.01293614] [ 0.0458079]
[-1.03071618] [-0.0023067]
[-1.02986377] [  7.49624786e-06]
[-1.02986653] [  1.20746968e-09]
[-1.02986653] [ -6.66133815e-16]

So far so good. I was now wondering: why does it call the function with the initial value four times? Thank you very much.

ThomasS
  • While I do not disagree with @Kanak, I just would not say _"they **are clearly** more than 8"_ – AGN Gazer Feb 14 '18 at 14:31
  • @AGNGazer. Yes. I'm not a native English speaker. It is energy-consuming for me to be as clear as possible. Sorry for that. – keepAlive Feb 14 '18 at 14:34
  • You can get machine precision by computing `7./3 - 4./3 -1`. See [this](https://stackoverflow.com/questions/19141432/python-numpy-machine-epsilon). – keepAlive Feb 14 '18 at 14:56

3 Answers


The root-finding routine first calls an input sanitizer, which evaluates the passed-in function at the initial value (1st evaluation).

Then the default solver (a modification of the Powell hybrid method, which is the one you're using) calls its internal MINPACK routine hybrd, which evaluates the function once at the beginning (2nd evaluation, at the initial value). hybrd then calls fdjac1 to build an approximate Jacobian at this location. That requires two more evaluations: one at the value itself (3rd time!) and one a small step ahead, which is the 4th call with a slightly different argument, as explained in Kanak's answer.
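
To see where those two extra evaluations come from, here is a rough sketch of a forward-difference derivative estimate in the spirit of fdjac1 (the step size below is only illustrative, not MINPACK's exact choice):

import numpy as np

def func(x):
    """ same test function as in the question """
    f = x + 2 * np.cos(x)
    print(x, f)
    return f

def forward_difference_slope(f, x, eps=1.49e-8):
    # One evaluation at x itself (the 3rd call) and one at a slightly
    # perturbed point x + eps (the 4th call), then a finite-difference slope.
    f0 = f(x)
    f1 = f(x + eps)
    return (f1 - f0) / eps

forward_difference_slope(func, np.array([0.3]))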

Edit: Duplicated callback evaluations can be highly undesirable when calling the function is expensive. If that is the case, it is possible to memoize the function so that repeated evaluations with the same input are avoided, without opening the black box of the numerical routines. Memoization works fine with pure functions (i.e. ones without side effects), which is often the case with the numerical functions passed to a root-finding or minimization routine.
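
If your function is expensive, a minimal memoization sketch could look like this (keying the cache on the argument's raw bytes is just one possible choice, not something scipy prescribes):

import numpy as np
from scipy.optimize import root

def memoized(f):
    # Cache results keyed on the bit pattern of x, so repeated calls
    # with a bit-identical argument reuse the stored value.
    cache = {}
    def wrapper(x):
        key = np.asarray(x, dtype=float).tobytes()
        if key not in cache:
            cache[key] = f(x)
        return cache[key]
    return wrapper

@memoized
def expensive_func(x):
    print(x)                      # now printed only once per distinct x
    return x + 2 * np.cos(x)

sol = root(expensive_func, 0.3)   # the duplicated calls at 0.3 hit the cache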

Cong Ma

They are actually not the same. One way to see that is to increase the print precision:

if __name__ == "__main__":
    np.set_printoptions(precision=15)
    main() 

This now outputs:

[0.3] [2.210672978251212]
[0.3] [2.210672978251212]
[0.3] [2.210672978251212]
[0.300000004470348] [2.210672980079404]
...

In this case we are lucky: the epsilon-sized change could still have been below our machine precision, and we would have seen nothing.
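
For what it's worth, the size of that change is consistent with a relative step of roughly the square root of machine epsilon (an assumption about the solver's default step length, but the numbers do line up):

import numpy as np

eps_machine = np.finfo(float).eps     # about 2.22e-16
step = np.sqrt(eps_machine) * 0.3     # about 4.470348e-09
print(step)                           # matches 0.300000004470348 - 0.3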

Edit

The answer is in the Fortran source code. Searching for "call fcn(n,x" in that source, it looks like the function is:

  1. first evaluated at the starting point
  2. then called if requested by nprint>0 to enable printing of iterates.
  3. and finally called before termination.

Hence the 3 prints one sees.

It follows that printing of iterates is now "turned on", and the numerical (and printed) estimation of the Jacobian starts.
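
As a quick cross-check (a sketch reusing the test function, without its print), the OptimizeResult returned by root reports the solver's own evaluation count, which can be compared with the number of printed lines:

import numpy as np
from scipy.optimize import root

def func(x):
    return x + 2 * np.cos(x)

sol = root(func, 0.3)
print(sol.nfev)   # the solver's own count of function evaluations
print(sol.x)      # the root it found, about -1.02986653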

keepAlive
  • Alright, now we can see that the initial value is repeated only three times. But the question still stands: why 3 times? – AGN Gazer Feb 14 '18 at 14:38

You did not provide Jacobian information, so numerical differentiation is used. Therefore func is called more times than the internal iteration count.

This also means that the first x values are not the same; they only differ by something near machine precision. Changing numpy's printoptions to precision=15 is not enough to observe this!

The default optimizer is hybr, and its docs say:

eps : float

A suitable step length for the forward-difference approximation of the Jacobian (for fprime=None). If eps is less than the machine precision, it is assumed that the relative errors in the functions are of the order of the machine precision.
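
If you want the perturbation to be visible at the default print precision, that eps option can be passed explicitly through the options dict (a sketch; 1e-6 is just an illustrative step length, much larger than the default):

import numpy as np
from scipy.optimize import root

def func(x):
    f = x + 2 * np.cos(x)
    print(x, f)
    return f

# A larger forward-difference step makes the perturbed starting point
# visibly different from 0.3 in the printed output.
sol = root(func, 0.3, options={'eps': 1e-6})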

Edit: it seems I'm wrong!

print(x,f)
print('hash x: ', hash(x[0].item()))

Out:

[0.3] [2.21067298]
hash x:  691752902764108160
[0.3] [2.21067298]
hash x:  691752902764108160
[0.3] [2.21067298]
hash x:  691752902764108160
[0.3] [2.21067298]
hash x:  691752913072029696

Those do indeed seem to be the same numbers (if there is no hidden magic in hash)! One probably needs to look at the internals to see whether some setup step requires this and is not being cached (some parts of scipy.optimize cache x arguments).
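
An alternative to hashing that makes the bit pattern explicit (a sketch of the same experiment) is float.hex():

import numpy as np
from scipy.optimize import root

def func(x):
    f = x + 2 * np.cos(x)
    print(float(x[0]).hex())   # e.g. '0x1.3333333333333p-2' for 0.3
    return f

root(func, 0.3)   # the first three patterns are identical, the fourth differs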

sascha
  • Why? It is simply showing that the precision needed to (discretely) see the change is higher than our machine precision. – keepAlive Feb 14 '18 at 15:04
  • @Kanak What? Hash will use all bytes of that float. If there is one bit different, it should be seen in the output (with high probability). I have to admit I did not check the hash implementation of floats, but I don't think they round beforehand or do anything like that. – sascha Feb 14 '18 at 15:05
  • Yes, I know that. This is what I am saying: the numbers really are the same. But in theory they are not; increasing the machine precision would show that. – keepAlive Feb 14 '18 at 15:07
  • Maybe you have more insight into the internals than I do. But an a priori eps that results in a 1:1 bit copy of the vector should be either cached or dynamically adjusted (there is no information gain). Now maybe there are reasons for that, or maybe the implementation is just missing this. – sascha Feb 14 '18 at 15:09