I am using a library whose method takes two functions as inputs and evaluates each of them many times. For example:

import numpy as np

def the_func(H, dH):
    many_iterations = 10
    for i in range(many_iterations):
        # internal calculations to yield a value of x
        x = np.random.randn(10)  # say!
        foo = H(x)
        bar = dH(x)
        # do something with foo and bar, e.g.
        print(foo, bar)

However, calculating H and dH shares a lot of code, and evaluating each is expensive, so I calculate them inside a single function that returns both. As an example, consider this function, which returns two values that correspond to H and dH above.

def my_func ( x ):
    # lots of calculations...
    return x.sum(), x/2.

Without changing the_func (which comes from a library), I would like my_func to run only once each time the_func evaluates H and dH. At the moment, I'm solving the problem by calling the_func as:

the_func ( H=lambda x: my_func(x)[0], dH=lambda x: my_func(x)[1] )

This works fine, but in each iteration inside the_func, my_func is called twice with exactly the same argument. I would like to evaluate it only once per iteration, without changing anything in the_func.

Jose
  • I don't understand your question. What is the relationship between `the_func` and `my_func`? From what you've shown there is nothing to memoize because nothing is re-used. – BrenBarn Aug 03 '15 at 16:02
  • Are you trying to call `the_func` with the output of `my_func`? – Brobin Aug 03 '15 at 16:02
  • `y = my_func(...); the_func(*y)`? – chepner Aug 03 '15 at 16:04
  • How does your title relate to your question? How does memoization apply to either? Could you provide a less abstract example of what you're actually trying to achieve? – jonrsharpe Aug 03 '15 at 16:04
  • @BrenBarn Have made some changes, hope things are a bit clearer now. Thanks! – Jose Aug 03 '15 at 16:11
  • Why aren't you just doing `funcA, funcB = myfunc(x)`? – jonrsharpe Aug 03 '15 at 16:13
  • It's still not very clear. Why don't you just do `funcA, funcB = my_func(x)` and then `the_func(funcA, funcB)`? Or are you saying `my_func` calculates not `funcA` and `funcB` but the arguments that will be passed to them? – BrenBarn Aug 03 '15 at 16:13
  • @jonrsharpe Sorry, funcA and funcB are called many times inside the_func – Jose Aug 03 '15 at 16:21
  • @Jose what do you mean *"called"*?! They're just values, aren't they? Please cut the hand-waving and give a [mcve] – jonrsharpe Aug 03 '15 at 16:22
  • It doesn't matter how many times they're called in `the_func` as long as the only thing that's changing is their argument `x`. From your description, it sounds like the functions `funcA` and `funcB` themselves don't change during the course of `the_func`. So just get them from `my_func` and pass them into `the_func`. What is confusing is that you say that it is expensive to compute `funcA` and `funcB`, but in your code you only compute them once anyway (in `my_func`), so I don't see what you hope to achieve in terms of efficiency gains. – BrenBarn Aug 03 '15 at 16:27
  • So `my_func` is actually returning **two functions**? What is it doing with `x`? By `foo = funcA(x)` in `the_func` do you mean `foo = H(x)`? Again, an MCVE would be much more helpful than the current vague example. – jonrsharpe Aug 03 '15 at 16:34
  • @jonrsharpe Hopefully, the example is complete and explains what the problem is. I'm sorry for the confusing question. – Jose Aug 03 '15 at 17:10
  • Your latest edit has confused me even more, unfortunately. Now it appears that the two things `my_func` is returning are not functions at all. If `my_func` returns `x.sum()`, how do you expect to be able to use that as `H`? If you pass that in, you will be doing `x.sum()(blah)`, which doesn't make much sense. – BrenBarn Aug 03 '15 at 19:24
  • @BrenBarn That's why I'm finding it hard to find answers ;-) The point is that ``the_func`` requires two functions that will be evaluated internally, and with their result (``foo`` and ``bar``), some more calculations will be done. However, I calculate these two values in the same external function (``my_func``), and do not have the ability to change the internals of ``the_func``. – Jose Aug 04 '15 at 10:12
  • Can you condense your question into a simple but complete runnable example showing how the functions interact? It remains very unclear what kinds of objects your various arguments are even supposed to be. – BrenBarn Aug 04 '15 at 17:31

1 Answer

so it occurred to me to use a memoization pattern, but given that x will be a numpy array (and thus unhashable), this needs some spelunking into the library code to make the array hashable that I would like to avoid.

So I still don't quite understand your question, just like everyone else, but if you simply want to memoize a function that takes an array argument, you may look at this question: "Most efficient property to hash for numpy array". The tl;dr version: if the array is not too large, use its tostring() method (named tobytes() in newer NumPy versions) to compute the hash key of the underlying array. This may not port well due to endianness/layout/etc., so beware of dragons. For large arrays you may need to consider a real hash function that is fast, such as xxhash, and manage the hash collisions.
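
As a rough illustration of the bytes-as-key idea, here is a minimal sketch of a memoizing decorator for a function of one array argument. The name memoize_array_func and the call counter are made up for this example, the cache is unbounded (a real one would need an eviction policy), and my_func is the toy function from the question:

```python
import numpy as np

def memoize_array_func(func):
    """Cache func(x) keyed on the contents of the array x (illustrative helper)."""
    cache = {}

    def wrapper(x):
        x = np.ascontiguousarray(x)
        # dtype and shape go into the key because the raw bytes alone cannot
        # tell, for example, a (2, 3) array from a (3, 2) array.
        key = (x.dtype.str, x.shape, x.tobytes())
        if key not in cache:
            cache[key] = func(x)
        return cache[key]

    return wrapper

calls = []  # records every real evaluation of my_func

@memoize_array_func
def my_func(x):
    calls.append(1)
    # lots of calculations...
    return x.sum(), x / 2.0

x = np.random.randn(10)
h = my_func(x)[0]   # evaluates my_func
dh = my_func(x)[1]  # served from the cache
print(len(calls))   # prints 1
```

Building the key copies the whole array into a bytes object on every call, so this only pays off when the function itself is much more expensive than that copy.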

Now there's a nice library called cachetools that is useful for building a memoizer function. It provides its own memoizing decorators, but they don't handle custom hash functions well. If you don't mind, you can look at my (admittedly makeshift) attempt to create method decorators for array-taking methods, on top of cachetools: https://github.com/congma/simplecache It may not be flexible or elegant, but it may give you some ideas for building your own ;)

For your case you may do something like this, in pseudo-Python:

import simplecache  # that's my module
class ThingClass(simplecache.ArrayMethodCacheMixin, ...):
    ... ...
    @simplecache.memoized()
    def _common_code_for_both(self, x):   # x is array argument
        ... ...  # expensive computation using x
        return intermediate_state

    def h(self, x):
        intermediate_state = self._common_code_for_both(x)
        return do_something_with(x, intermediate_state)

    def dh(self, x):
        intermediate_state = self._common_code_for_both(x)
        return do_something_else_with(x, intermediate_state)

thing = ThingClass(...)

Now if thing.h() and thing.dh() are called in succession with the same argument, the intermediate computation won't be repeated. Now you just pass thing.h and thing.dh to the library function.

It ain't pretty but it may save some unnecessary evaluation.
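
If pulling in an extra module is not an option, the same idea can be sketched with plain NumPy by remembering only the most recent argument and result. This is a swapped-in, simpler technique and not the simplecache API: the class name SharedEvaluation and its attributes are hypothetical, the_func below is a stand-in for the library routine from the question, and it assumes the library calls H and dH back to back with the same x:

```python
import numpy as np

class SharedEvaluation:
    """Remember the latest (x, result) pair of an expensive function (hypothetical helper)."""

    def __init__(self, func):
        self.func = func
        self.last_x = None
        self.last_result = None
        self.n_calls = 0  # number of real evaluations of func

    def _evaluate(self, x):
        # Re-run func only when x differs from the previously seen argument.
        if self.last_x is None or not np.array_equal(x, self.last_x):
            self.last_x = np.array(x, copy=True)  # copy guards against in-place changes
            self.last_result = self.func(x)
            self.n_calls += 1
        return self.last_result

    def H(self, x):
        return self._evaluate(x)[0]

    def dH(self, x):
        return self._evaluate(x)[1]

def my_func(x):
    # lots of calculations...
    return x.sum(), x / 2.0

def the_func(H, dH, many_iterations=10):
    # Stand-in for the library routine; it collects results instead of printing.
    results = []
    for _ in range(many_iterations):
        x = np.random.randn(10)
        results.append((H(x), dH(x)))
    return results

shared = SharedEvaluation(my_func)
results = the_func(H=shared.H, dH=shared.dH)
print(shared.n_calls)  # 10 evaluations for 10 iterations, not 20
```

The comparison with np.array_equal costs one pass over the array per call, which is cheap next to an expensive my_func, and no hashing is needed at all.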

Cong Ma