1

So was playing around with memory_profiler module and noticed something weird. I was working on a function used to find nCr combinations using recursion. I made 2 version of of my function :

ver 1:

@profile
def c(n, r, memo={}):
    pair = (n, r)
    if pair in memo:
        return memo[pair]

    if r == 1:
        return n
    if n - r < r:
        return c(n, n - r)
    
    if r == 0 or r == n:
        return 1

    return  c(n - 1, r - 1) + c(n - 1, r)
    
print(c(20, 10))

ver 2:

@profile
def c(n, r, memo={}):
    pair = (n, r)
    if pair in memo:
        return memo[pair]

    if r == 1:
        return n
    if n - r < r:
        return c(n, n - r)
    
    if r == 0 or r == n:
        return 1

    memo[pair] = c(n - 1, r - 1) + c(n - 1, r)
    return memo[pair]

print(c(20, 10))

memory_profiler :

  1. ver 1 : enter image description here
  2. ver 2 :enter image description here

As you can see ver 1 and ver 2 are the same except for the last 2 lines of the function. But running memory_profiler on them they have great difference in performance. The thing I'm most puzzled by is although I forgot to pass memo in the function call it still results in such great performance improvement.

Each function call results in a new empty memo dict for that function. So why does line 8 in version 1 evaluates as False and fails to return whereas the same line in ver 2 evaluates True and returns 36 times. Shouldn't they both realistically be false as I'm trying to find key in an empty dictionary ? Shouldn't there be virtually no observable difference in performance ?

user3840170
  • 26,597
  • 4
  • 30
  • 62
Shaswat
  • 39
  • 4
  • Seems you fell into the trap of [mutable default argument](https://stackoverflow.com/q/1132941/1435475). Contrary to your expection this is not always empty: There is only one memo instance and it is updated over the whole recursion in case two. – guidot Dec 08 '20 at 16:19

1 Answers1

1

Your assumption that

Each function call results in a new empty memo dict for that function.

is incorrect. Unlike in JavaScript, in Python default parameter values are evaluated at definition time, once. If the c function is called without a value for the memo parameter, the instance of the dict constructed at definition time will be used as its value each time. At construction time it is empty, but subsequent invocations may mutate it. Observe the following:

def accumulate(a, b=[]):
    b.append(a)
    print(b)

accumulate(1)
accumulate('foo')
accumulate(None)

The above will print [1], then [1, 'foo'], then [1, 'foo', None].

Using default parameter values for memoisation is a pretty bad idea. Not only is it rather obscure and confusing, it also allows the caller to break the function by actually providing a value for the memo parameter. A better idea would be to use a global/non-local binding, or a decorator such as functools.cache or functools.lru_cache.

user3840170
  • 26,597
  • 4
  • 30
  • 62