
Suppose we have a function with a mutable default argument:

def f(l=[]):
    l.append(len(l))
    return l

If I run this:

def f(l=[]):
    l.append(len(l))
    return l
print(f()+["-"]+f()+["-"]+f()) # -> [0, '-', 0, 1, '-', 0, 1, 2]

Or this:

def f(l=[]):
    l.append(len(l))
    return l
print(f()+f()+f()) # -> [0, 1, 0, 1, 0, 1, 2]

I get the output shown in the comment, instead of the following, which would seem more logical:

print(f()+f()+f()) # -> [0, 0, 1, 0, 1, 2]

Why?

Karl Knechtel
Benoît P

2 Answers


That's actually pretty interesting!

As we know, the default list l is created only once, when the def statement runs, so every call that relies on the default shares that one list. The function modifies this list, which means that multiple calls to this function will modify the exact same object multiple times. This is the first important part.
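A quick way to check this yourself, as a minimal sketch that reuses the f from the question: two calls without an argument hand back the very same list object, not two equal lists.

```python
def f(l=[]):
    l.append(len(l))
    return l

first = f()    # the shared default list now holds [0]
second = f()   # the same list again, now [0, 1]

# Both names refer to the single list created when `def` ran.
print(first is second)  # True
print(first)            # [0, 1]
```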

Now, consider the expression that adds these lists:

f()+f()+f()

Because + is left-associative (operators of equal precedence group from left to right), this is equivalent to the following:

(f() + f()) + f()

...which is exactly the same as this:

temp1 = f() + f() # (1)
temp2 = temp1 + f() # (2)

This is the second important part.

Addition of lists produces a new object, without modifying any of its arguments. This is the third important part.
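To illustrate the point about addition, here is a small sketch with two throwaway lists (a, b and c are names made up for this example): the result of + is a separate object, so later changes to an operand do not reach it.

```python
a = [0, 1]
b = [2]

c = a + b               # builds a brand-new list
print(c)                # [0, 1, 2]
print(c is a, c is b)   # False False

a.append(99)            # mutate an operand afterwards...
print(c)                # ...the result is unchanged: [0, 1, 2]
```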

Now let's combine what we know together.

In line (1) above, the first call returns [0], as you'd expect. The second call returns [0, 1], as you'd expect. Oh, wait! The function returns the exact same object (not a copy!) every time, after modifying it! This means that the object that the first call returned has now changed to become [0, 1] as well! And that's why temp1 == [0, 1] + [0, 1].

The result of addition, however, is a completely new object, so [0, 1, 0, 1] + f() is the same as [0, 1, 0, 1] + [0, 1, 2]. Note that the second list is, again, exactly what you'd expect your function to return. The same thing happens when you add f() + ["-"]: this creates a new list object, so that any other calls to f won't interfere with it.
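The whole evaluation can be traced by hand. This is only a sketch of the order in which Python evaluates the operands; the names left, right and third are introduced here for illustration and are not part of the original expression.

```python
def f(l=[]):
    l.append(len(l))
    return l

# Both operands of a + are evaluated (left to right) before it is applied.
left = f()              # the shared list: [0]
right = f()             # the same shared list: [0, 1]
assert left is right    # so `left` now reads [0, 1] too

temp1 = left + right    # new list, a snapshot: [0, 1, 0, 1]
third = f()             # the shared list grows to [0, 1, 2]
temp2 = temp1 + third   # temp1 was not touched by the third call

print(temp2)  # [0, 1, 0, 1, 0, 1, 2]
```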

You can reproduce this by concatenating the results of two function calls:

>>> f() + f()
[0, 1, 0, 1]
>>> f() + f()
[0, 1, 2, 3, 0, 1, 2, 3]
>>> f() + f()
[0, 1, 2, 3, 4, 5, 0, 1, 2, 3, 4, 5]

Again, this happens because both operands of each + are references to the same object.

ForceBru
  • Wow. The best thing about this answer is that it is right. `[]+f()+f()` gives `[0, 0, 1]` and `f()+f()+[]` gives `[0, 1, 0, 1]`! – Benoît P Aug 21 '19 at 14:20
  • It's just how Python's print processes the function call, either preprocessing or postprocessing. – sahasrara62 Aug 21 '19 at 14:21
  • The new object returned by `+` is definitely the key here, it's so subtle. – C.Nivs Aug 21 '19 at 14:22
  • A good reminder of why a mutable default argument is a big no-no. – EliadL Aug 21 '19 at 14:23
  • @BenoîtPilatte, yep! I've also edited my answer to clarify the part about the result of addition being a new object. – ForceBru Aug 21 '19 at 14:23
  • Looked into `dis(lambda: f()+f()+f())` and the function `f` did get called twice before `add` is performed. Great answer btw. – Henry Yik Aug 21 '19 at 14:41
  • @HenryYik It's because of how `list.__add__` is implemented. The first `f()` instantiates the list object, let's call it `l1`. Then `l1.__add__(f())` is called, so `f()` needs to be evaluated *first*, which changes the object that is shared with `l1`. Then `l1.__add__(l2)` finishes, returning the new object. – C.Nivs Aug 21 '19 at 15:10
  • @C.Nivs That was actually much easier to understand than operator precedence! – Henry Yik Aug 21 '19 at 15:18
  • @EliadL You could conceivably use a mutable default as some sort of cache on a function to keep track of calls. Now, this would probably be better suited for a `class` or `lru` cache, but as a naive or small example, I wouldn't say it's a *horrible* idea. – C.Nivs Aug 27 '19 at 20:38
  • Thank you for describing this by breaking down the expression into separate statements. That's generally my preferred way of explaining complex expressions like this, and I find that it really aids understanding. – Barmar Aug 27 '19 at 22:27

Here's a way to think about it that might help it make sense:

A function is a data structure. You create one with a def block, much the same way as you create a type with a class block or you create a list with square brackets.

The most interesting part of that data structure is the code that gets run when the function is called, but the default arguments are also part of it! In fact, you can inspect both the code and the default arguments from Python, via attributes on the function:

>>> def foo(a=1): pass
... 
>>> dir(foo)
['__annotations__', '__call__', '__class__', '__closure__', '__code__', '__defaults__', ...]
>>> foo.__code__
<code object foo at 0x7f114752a660, file "<stdin>", line 1>
>>> foo.__defaults__
(1,)

(A much nicer interface for this is inspect.signature, but all it does is examine those attributes.)
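For comparison, here is a minimal sketch of the inspect.signature route, using the same foo as above. It reports the same default that is stored in foo.__defaults__.

```python
import inspect

def foo(a=1):
    pass

sig = inspect.signature(foo)
print(sig)                          # (a=1)
print(sig.parameters["a"].default)  # 1
print(foo.__defaults__)             # (1,)
```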

So the reason that this modifies the list:

def f(l=[]):
    l.append(len(l))
    return l

is exactly the same reason that this also modifies the list:

f = dict(l=[])
f['l'].append(len(f['l']))

In both cases, you're mutating a list that belongs to some parent structure, so the change will naturally be visible in the parent as well.
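You can watch the parent structure change. The sketch below takes the f from the question and prints its __defaults__ tuple between calls; the list stored inside it grows each time.

```python
def f(l=[]):
    l.append(len(l))
    return l

print(f.__defaults__)  # ([],)
f()
print(f.__defaults__)  # ([0],)
f()
print(f.__defaults__)  # ([0, 1],)
```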


Note that this is a design decision that Python specifically made, and it's not inherently necessary in a language. JavaScript gained default arguments in ES2015, but it treats them as expressions to be re-evaluated on each call; essentially, each default argument is its own tiny function. The advantage is that JS doesn't have this gotcha, but the drawback is that you can't meaningfully inspect the defaults the way you can in Python.
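If you want per-call evaluation in Python, the usual workaround is a None sentinel. The function g below is a hypothetical variant of the question's f, written only to show the difference; it is a sketch, not something from the original post.

```python
def g(l=None):
    # `g` is a hypothetical variant of the question's f.
    if l is None:
        l = []  # runs on every call, so each call gets its own list
    l.append(len(l))
    return l

print(g() + g() + g())  # [0, 0, 0]
print(g() is g())       # False
```

The trade-off matches the one described above: the gotcha is gone, but g.__defaults__ now only shows (None,), so the real default is no longer visible from outside the function.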

Eevee
  • I think the OP is generally aware of the "mutable default argument" issue and why it happens. This question is more complicated. – Barmar Aug 27 '19 at 22:28
  • The general problem of mutable default arguments is explained in [this question](https://stackoverflow.com/questions/1132941/least-astonishment-and-the-mutable-default-argument) – Barmar Aug 27 '19 at 22:29
  • I think this answer should stay there or be fused to the accepted answer as it contains some complementary information that can be useful to understand the accepted answer and the question in the first place. (or at least include @Barmar's link, but I prefer self-contained answers) – Benoît P Aug 31 '19 at 09:57