
Background

I have a function that takes a number of parameters and returns an error measure which I then want to minimize (using scipy.optimize.leastsq, but that is beside the point right now).

As a toy example, let's assume my function to optimize takes the four parameters a, b, c, d:

def f(a,b,c,d):
    err = a*b - c*d
    return err

The optimizer then wants a function with the signature func(x, *args), where x is the parameter vector.

That is, my function is currently written like:

def f_opt(x, *args):
    a,b,c,d = x
    err = a*b - c*d
    return err

But now I want to run a number of experiments where I fix some parameters while keeping others free in the optimization step.

I could of course do something like:

def f_ad_free(x, b, c):
    a, d = x
    return f(a,b,c,d)

But this will be cumbersome, since I have over 10 parameters and the number of possible free-vs-fixed combinations is potentially quite large.

First approach using dicts

One solution I had was to write my inner function f with keyword args instead of positional args and then wrap it like this:

def generate(func, all_param, fixed_param):
    # One slot per parameter; filled in on every call.
    param_dict = {k: None for k in all_param}
    # The free parameters are whatever is not fixed, in their original order.
    free_param = [param for param in all_param if param not in fixed_param]
    def wrapped(x, *args):
        # args carries the fixed values, x the free values from the optimizer.
        param_dict.update({k: v for k, v in zip(fixed_param, args)})
        param_dict.update({k: v for k, v in zip(free_param, x)})
        return func(**param_dict)
    return wrapped

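The keyword-args version of the toy function, which I will call f_inner below, is then something like:

def f_inner(a=None, b=None, c=None, d=None):
    return a * b - c * d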
Creating a function that fixes 'b' and 'c' then looks like this:

all_params = ['a','b','c','d']
f_bc_fixed = generate(f_inner, all_params, ['b', 'c'])
a = 1
b = 2
c = 3
d = 4
f_bc_fixed((a,d), b, c)

Question time!

My question is whether anyone can think of a neater way to solve this. Since the final function is going to be run in an optimization step, I can't accept too much overhead per function call. The time it takes to generate the optimization function is irrelevant.

Hannes Ovrén

2 Answers


Your generate function is basically the same as functools.partial, which is what I would use here.
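For example, partial can pre-bind the fixed keyword arguments of the keyword-args inner function (a rough sketch, reusing the f_inner from the question):

from functools import partial

f_bc_fixed = partial(f_inner, b=2, c=3)
f_bc_fixed(a=1, d=4)   # 1*2 - 3*4 = -10

As the comments below point out, this still leaves the job of unpacking the optimizer's x vector into the free parameters, which partial cannot do by itself.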

Daniel Roseman
  • I looked quickly at partial and did not really see how I could use it. Can you update your answer with some example code given the toy function above? – Hannes Ovrén Sep 11 '13 at 09:06
  • @kigurai I don't think `partial` works well for this because it seems that you need a wrapper that can effectively reorder arguments -- `partial` doesn't support that. – senderle Sep 11 '13 at 09:26
  • Yeah, that is pretty much what it looked like to me, and is the reason why I never investigated it. – Hannes Ovrén Sep 11 '13 at 16:00

I can think of a couple of ways to avoid using a closure as you do above, though after doing some testing, I'm not sure either of them will be faster. One approach might be to skip the wrapper and just write a function that accepts:

  1. A vector
  2. A list of free names
  3. A dictionary mapping the fixed parameter names to their values.

Then do something very like what you do above, but in the function itself:

def f(free_vals, free_names, params):
    # Overwrite the free slots with the optimizer's current values.
    params.update(zip(free_names, free_vals))
    err = params['a'] * params['b'] - params['c'] * params['d']
    return err

If your code uses the parameter values multiple times, pull them into local variables up front, e.g.

a = params['a']
b = params['b']

and so on. This might seem cumbersome, but it has the advantage of making everything explicit, avoiding the kinds of namespace searches that could make closures slow.

Then pass a list of free names and a dictionary of fixed params via the args parameter to optimize.leastsq. (Note that the params dictionary is mutable, which means that there could be side effects in theory; but in this case it shouldn't matter because only the free params are being overwritten by update, so I omitted the copy step for the sake of speed.)
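A sketch of how the pieces would fit together with the toy values from the question (note that the residual actually passed to leastsq should return a vector with at least as many entries as there are free parameters, so the scalar toy function only stands in here to show the argument flow):

fixed = {'b': 2.0, 'c': 3.0}
free_names = ['a', 'd']
x0 = [1.0, 4.0]

f(x0, free_names, fixed)   # 1*2 - 3*4 = -10

# roughly: leastsq(f, x0, args=(free_names, fixed))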

The main downsides of this approach are that it shifts some complexity into the call to optimize.leastsq, and it makes your code less reusable. A second approach avoids those problems, though it might not be quite as fast: using a callable class.

class OptWrapper(object):
    def __init__(self, func, free_names, **fixed_params):
        self.func = func
        self.free_names = free_names
        self.params = fixed_params

    def __call__(self, x, *args):
        # Overwrite the free slots with the optimizer's current values.
        self.params.update(zip(self.free_names, x))
        return self.func(**self.params)

You can see that I simplified the parameter structure for __init__; the fixed params are passed here as keyword arguments, and the user must ensure that free_names and fixed_params don't have overlapping names. I think the simplicity is worth the tradeoff but you can easily enforce the separation between the two just as you did in your wrapper code.
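A usage sketch with the toy values from the question (again assuming a keyword-args f_inner like the one sketched there):

f_bc_fixed = OptWrapper(f_inner, ['a', 'd'], b=2, c=3)
f_bc_fixed([1, 4])   # 1*2 - 3*4 = -10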

I like this second approach best; it has the flexibility of your closure-based approach, but I find it more readable. All the names are in (or can be accessed through) the local namespace, which I thought would speed things up -- but after some testing I think there's reason to believe that the closure approach will still be faster; accessing the __call__ method seems to add about 100 ns of overhead per call. I would strongly recommend testing if performance is a real issue.
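A rough way to run that comparison yourself (a sketch; it assumes the generate closure, OptWrapper and f_inner from above are defined in the running module, and absolute numbers will vary by machine and Python version):

import timeit

setup = (
    "from __main__ import generate, OptWrapper, f_inner\n"
    "closure = generate(f_inner, ['a', 'b', 'c', 'd'], ['b', 'c'])\n"
    "wrapper = OptWrapper(f_inner, ['a', 'd'], b=2, c=3)\n"
    "x = (1.0, 4.0)\n"
)

print(timeit.timeit("closure(x, 2, 3)", setup=setup, number=1000000))
print(timeit.timeit("wrapper(x)", setup=setup, number=1000000))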

senderle
  • The callable class is something that I thought of, but had no previous experience of. To me it looks quite similar to my wrapping example. Care to elaborate why it is better (except for being neater, of course)? – Hannes Ovrén Sep 11 '13 at 16:03
  • @kigurai, the theoretical advantage of a callable class is that all the names are resolved on the first try. When you refer to a variable inside a function, Python searches for the name in a series of namespaces. It first looks in the local namespace; then in the namespace of enclosing functions; and so on. (Details [here](http://stackoverflow.com/a/292502/577088).) Obviously it's fastest if the name is defined in the first place Python looks. Of course this isn't the only factor, which is why I would suggest testing. It's going to depend on what precisely you do. – senderle Sep 11 '13 at 18:11
  • Thanks! I ended up with the callable class. I modified it for positional arguments instead of keyword arguments though. I have not checked the relative speed of the different implementations though as the solution is fast enough for me. Thanks again for the help and explanations! – Hannes Ovrén Sep 12 '13 at 12:30
  • @kigurai, you might want to hold your thanks -- I just did a bit of testing, and given some toy example code, the closure was actually faster! I guess it has something to do with the overhead of accessing the `__call__` method. – senderle Sep 12 '13 at 15:28
  • Ah, good to know. It was fast enough for me, and was in my view a prettier way of doing it. I'll keep your answer as the accepted one though since I still think it helped me. – Hannes Ovrén Sep 16 '13 at 06:59