3

In a recent Hacker Newsletter issue, this very useful article about decorators in Python was linked. I like the article and I think I understand most of the decorator examples. However, in the non-decorator memoization example, I'm very confused by the code:

def memoize(fn):
    stored_results = {}

    def memoized(*args):
        try:
            # try to get the cached result
            return stored_results[args]
        except KeyError:
            # nothing was cached for those args. let's fix that.
            result = stored_results[args] = fn(*args)
            return result

    return memoized

I'm confused about how this function would create a persisting dictionary stored_results that gets appended to. After re-reading it, copy/pasting it into my editor and playing with it, and looking online for help, I still don't understand what the syntax stored_results[args] = fn(*args) is really doing.

(1) The article suggests that the above code will return the function, but that now it will search a dictionary first before executing on novel arguments. How does this happen? Why isn't stored_results just local to memoize? Why doesn't it get destroyed when memoized is returned?

(2) Links to other questions or web resources that explain the argument passing here with *args would be helpful too. If *args is a list of arguments, why can we use the syntax stored_results[args], when normally you get a non-hashable error when trying to index a dictionary on a list?

Thanks for any clarifying thoughts.

ely
  • 74,674
  • 34
  • 147
  • 228
  • Note that there is no reason to use this in modern Python; the standard library provides `functools.cache` for the purpose. – Karl Knechtel Aug 19 '22 at 11:50
  • @KarlKnechtel indeed the older `lru_cache` was already available long ago. This is about learning - I think your comment is a little too pedantic here. The question has nothing to do with using `cache` or `lru_cache` and readers don't need to be explained that. – ely Aug 30 '22 at 13:38
  • I am not sure who closed the question, but the proposed duplicate also seems incorrect. The proposed linked question was about *lambda* closures, while this one is specifically about function closures in the context of decorated functions. It is not generically about closures, rather specifically about the mechanics of using them for decorated functions. Definitely way overly pedantic to "close" this question 10 years later as a duplicate! – ely Aug 30 '22 at 13:42
  • "The proposed linked question was about lambda closures, while this one is specifically about function closures in the context of decorated functions" Lambda closures **are not different** from function closures, because lambdas are **just** a way to create functions. Implementation differences are minor, and in particular their closures work the same way. The "lambda function closures" question is *routinely* used to close questions about nested functions, because all of the same reasoning applies. The context of using a decorator is also not relevant. – Karl Knechtel Aug 31 '22 at 00:05
  • Further: closing questions as duplicates far into the future *is performing a valuable service* in curating the site. The point is to direct people who find a tangentially relevant question in a search engine, to the canonical which provides the necessary background information. – Karl Knechtel Aug 31 '22 at 00:07

1 Answers1

5

If *args is a list of arguments, why can we use the syntax stored_results[args], when normally you get a non-hashable error when trying to index a dictionary on a list?

Because it's not a list, it's a tuple. Lists are mutable, so you can't define a meaningful hash function on them. Tuples however are immutable data structures.

Why isn't stored_results just local to memoize? Why doesn't it get destroyed when memoized is returned?

Because memoize and memoized share the same name context (closure). The closure persists because memoized holds a reference to it and you return it and assign it to a global name (this is the effect of the decorator statement). With the closure, all the captured values persist as well.

Niklas B.
  • 92,950
  • 18
  • 194
  • 224
  • 1
    So, `fn(*args)` pushes a tuple made of the elements of `args` into the function `fn()`. Does this mean that `stored_results` has an entry that is index by that *whole tuple*? In the link's example, they do this to a Fibonacci function. So if I call that function with a single integer argument, say `6`, will the dictionary be expanded to have an entry like `{(6):8, ...}`? If so, that clears up one question: it assumes the whole tuple full of `args` will be used as a single memoized key. It still leaves me wondering why `stored_results` persists across multiple calls to the new function. – ely Apr 20 '12 at 01:52
  • It seems like the interior `memoized` thing that gets defined only knows about `stored_results` by proxy. Does Python implicitly keep this object alive *because* the definition of the interior `memoized` refers to it? – ely Apr 20 '12 at 01:53
  • 2
    @EMS yes, the function memoized has a reference to this object so pythong will not garbage collect it. This kind of function with a "free-variable" is called a [closure](http://en.wikipedia.org/wiki/Closure_(computer_science)) – tobyodavies Apr 20 '12 at 01:54
  • That makes a lot more sense, but it seems rather hacky. It definitely reads in such a way that it feels like the code should create a scope error. `stored_results` isn't global and wasn't passed in to `memoized` as an argument. Is this specific to functions defined within other functions? Will subfunctions always know about parent functions' objects? – ely Apr 20 '12 at 01:58
  • 1
    I guess that was more to @tobyodavies comment: the concept of a closure makes sense, but it seems arbitrary that this particular function can see something that's non-local to it. Are there important conventions for closures in Python that one just has to memorize? (I'm Googling this now, so perhaps the answer is immediate and obvious, but I felt it was good to document it here anyway...) – ely Apr 20 '12 at 02:01
  • @EMS: I have to correct myself regarding the closure explanation. Please see my edited answer. – Niklas B. Apr 20 '12 at 02:07
  • Thanks for the clarification. I was aware of why tuples can be used as dictionary keys, I just mistakenly thought `*args` acted as a list. To the point about closures, suppose I created another variable right beneath the creation of `stored_results` that was just `x=0`. And then suppose that inside of `memoized` somewhere, I incremented `x` all the time, but never did anything with it. Then the object that gets created when I "make" `x` will persist inside of the returned function? Will it be inaccessible other than it being incremented? I can amend the above code if my question isn't clear. – ely Apr 20 '12 at 02:20
  • @EMS closures are very powerful tools. They look odd in python because variables are never explicitly declared, compared to JS or Lisp where it is more obvious where the variable is coming from. The rule python uses is called lexical scoping, that is it looks "out" from the definition point and uses the closest scope containing the desired variable name. In this sense there is nothing special about global variables --- this is just the highest level scope python can ever reach. – tobyodavies Apr 20 '12 at 02:24
  • Also, if either of you know good textbooks that explain more of this sort of formal software development and computer science material, please share. I studied pure math in school, and only learned theoretical computer science. Now in grad school, I do applied math and everything is about making libraries work together to implement math functions and larger algorithms with large data. Along the way I missed a lot of important stuff bridging the two domains. – ely Apr 20 '12 at 02:32
  • @EMS: Yes, in your theoretical example, `x` would be capture by the closure. If you can access it or not depends on the actual implementation (I don't think you can in Python). Also, it will not persist inside the returned function, but inside the closure that the returned function references (which is in fact the scope of the `memoize` method). As a mathematician, you might want to look at lambda calculus for the formal theory behind this. – Niklas B. Apr 20 '12 at 15:50
  • If you're interested in closures in general, I can recommend questions here on SO: [1](http://stackoverflow.com/questions/111102/how-do-javascript-closures-work) [2](http://stackoverflow.com/questions/233673/lexical-closures-in-python). – Niklas B. Apr 20 '12 at 15:50