136

It is a common mistake in Python to set a mutable object as the default value of an argument in a function. Here's an example taken from this excellent write-up by David Goodger:

>>> def bad_append(new_item, a_list=[]):
        a_list.append(new_item)
        return a_list
>>> print(bad_append('one'))
['one']
>>> print(bad_append('two'))
['one', 'two']

The explanation why this happens is here.

And now for my question: Is there a good use-case for this syntax?

I mean, if everybody who encounters it makes the same mistake, debugs it, understands the issue and from then on tries to avoid it, what use is there for such syntax?

Karl Knechtel
Jonathan Livni
  • See http://stackoverflow.com/questions/1132941/least-astonishment-in-python-the-mutable-default-argument – Sam Dolan Feb 06 '12 at 09:55
  • 1
    The best explanation I know for this is in the linked question: functions are first-class objects, just like classes. Classes have mutable attribute data; functions have mutable default values. – Katriel Feb 06 '12 at 10:23
  • http://stackoverflow.com/questions/2639915/why-the-mutable-default-argument-fix-syntax-is-so-ugly-asks-python-newbie – Katriel Feb 06 '12 at 10:27
  • 18
    This behavior is not a "design choice" - it is a result of the way the language works - starting from simple working principles, with as few exceptions as possible. At some point for me, as I started to "think in Python" this behavior just became natural - and I'd be surprised if it did not happen – jsbueno Feb 06 '12 at 18:33
  • 3
    I've wondered this too. This example is all over the web, but it just doesn't make sense - either you want to mutate the passed list and having a default doesn't make sense, or you want to return a new list and you should make a copy immediately upon entering the function. I can't imagine the case where it's useful to do both. – Mark Ransom Jul 10 '12 at 15:22
  • 1
    FWIW I use them in [my answer](http://stackoverflow.com/questions/4103773/efficient-way-of-having-a-function-only-execute-once-in-a-loop/4115934#4115934) to the question [_Efficient way of having a function only execute once in a loop_](http://stackoverflow.com/questions/4103773/efficient-way-of-having-a-function-only-execute-once-in-a-loop). – martineau Dec 28 '14 at 17:19
  • 3
    I just came across a more realistic example that doesn't have the problem I complain about above. The default is an argument to the `__init__` function for a class, which gets set into an instance variable; this is a perfectly valid thing to want to do, and it all goes horribly wrong with a mutable default. http://stackoverflow.com/questions/43768055/python-class-instance-variable-isolation – Mark Ransom May 03 '17 at 19:42
  • 5 years later : Okay. It seems that the only good use for this bug is memoization, which can be done with `@functools.lru_cache` anyway. Sigh... – Eric Duminil Oct 15 '17 at 19:36
  • @MarkRansom: Good example. It looks correct and simple enough, it works in many other languages but fails miserably in Python. – Eric Duminil Oct 15 '17 at 19:43
  • @EricDuminil it's not a bug in Python, it's just something that requires a deeper understanding of the internals than most people are willing to accommodate. It makes perfect sense once you've grokked those internals, but it trips up the naive every day - it's completely non-intuitive. – Mark Ransom Oct 16 '17 at 02:24
  • 3
    @MarkRansom: With your definition, there wouldn't be any bug ever on a (deterministic) computer. Every bug makes sense when you spend enough time grokking the internals. Let's be honest and call this behaviour one of the very few design flaws in Python. – Eric Duminil Oct 16 '17 at 06:30

8 Answers

91

You can use it to cache values between function calls:

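def get_from_cache(name, cache={}):
    # The default dict is created once, when the def statement runs,
    # so it persists across calls and acts as the cache.
    if name in cache:
        return cache[name]
    cache[name] = result = expensive_calculation(name)
    return result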

but usually that sort of thing is done better with a class as you can then have additional attributes to clear the cache etc.
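
For the "clear the cache" part without writing a class, one option is `functools.lru_cache` from the standard library, whose wrapped functions expose `cache_clear()` and `cache_info()`. This is only a minimal sketch: `expensive_calculation` and `call_count` are hypothetical stand-ins, not part of the answer above.

```python
import functools

call_count = 0  # only here to show how often the real work runs


@functools.lru_cache(maxsize=None)
def expensive_calculation(name):
    """Hypothetical stand-in for a slow computation keyed by name."""
    global call_count
    call_count += 1
    return name.upper()


expensive_calculation("spam")   # computed
expensive_calculation("spam")   # served from the cache
print(call_count)               # 1

expensive_calculation.cache_clear()  # explicit invalidation
expensive_calculation("spam")        # computed again
print(call_count)                    # 2
```

The mutable-default version keeps its state in the function's `__defaults__`, which works but gives callers no obvious way to reset it.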

Chen A.
Duncan
21

Canonical answer is this page: http://effbot.org/zone/default-values.htm

It also mentions 3 "good" use cases for mutable default arguments:

  • binding local variable to current value of outer variable in a callback
  • cache/memoization
  • local rebinding of global names (for highly optimized code)
Boris Verkhovskiy
  • 2
    It seems like "binding local variable to current value of outer variable in a callback" is just a workaround for another design flaw in Python. – emclain Dec 08 '21 at 20:01
16

Maybe you do not mutate the mutable argument, but do expect a mutable argument:

def foo(x, y, config={}):
    my_config = {'debug': True, 'verbose': False}
    my_config.update(config)
    return bar(x, my_config) + baz(y, my_config)

(Yes, I know you can use config=() in this particular case, but I find that less clear and less general.)
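
A runnable sketch of the same pattern, assuming a hypothetical `make_settings` function in place of `foo`, `bar` and `baz`. It shows why a read-only mutable default is safe: the default dict is only copied from, never modified or returned.

```python
def make_settings(overrides={}):
    """Return a fresh settings dict; overrides is only read, never mutated."""
    settings = {"debug": True, "verbose": False}
    settings.update(overrides)  # copies entries out of overrides
    return settings             # a new dict, not the shared default


first = make_settings({"verbose": True})
first["debug"] = False          # mutating the result is harmless
second = make_settings()

print(second)                       # {'debug': True, 'verbose': False}
print(make_settings.__defaults__)   # ({},) -- the shared default is still empty
```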

Reinstate Monica
  • 11
    Also make sure that you **do not mutate** and **do not return** this default value directly from the function, otherwise some code outside the function can mutate it and it will affect all function calls. – Andrey Semakin Dec 21 '19 at 05:03
11
import random

def ten_random_numbers(rng=random):
    return [rng.random() for _ in range(10)]

Uses the random module, effectively a mutable singleton, as its default random number generator.
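
One practical benefit of this signature is that a caller can swap in their own generator. A short sketch, assuming Python 3 and using `random.Random` instances (which provide the same `random()` method as the module):

```python
import random


def ten_random_numbers(rng=random):
    """Draw ten floats from rng, which defaults to the shared random module."""
    return [rng.random() for _ in range(10)]


# Default: uses the module-level generator, whose state changes on every call.
unseeded = ten_random_numbers()

# Reproducible: two generators seeded identically yield identical sequences.
a = ten_random_numbers(random.Random(42))
b = ten_random_numbers(random.Random(42))
print(a == b)  # True
```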

Fred Foo
  • 7
    But this isn't a terribly important use case either. – Evgeni Sergeev Jan 18 '14 at 07:02
  • 4
    I think there is no difference in behavior, between Python's "obtain reference once" and not-Python's "lookup `random` once per function call". Both end up using the same object. – nyanpasu64 Nov 18 '18 at 02:08
  • `random` isn't mutable though. `import random` then `print(hash(random))`. Modules, classes (`type`s, not instances), and functions are considered immutable. This is how a lot of memoization and dependency injection machinery works. – DeusXMachina Mar 09 '22 at 20:54
  • Nota bene: "mutable" in python has some very specific connotations (technically everything is "mutable", it's python). Maybe "frozen" is a better term. Anything which obeys a `Hashable` (`__hash__`/`__eq__`) interface can be considered frozen. Just because an object has side effects does not make it mutable: `socket.socket` is another example of a first-class hashable object with side effects. – DeusXMachina Mar 09 '22 at 21:12
7

I know this is an old one, but just for the heck of it I'd like to add a use case to this thread. I regularly write custom functions and layers for TensorFlow/Keras, upload my scripts to a server, train the models (with custom objects) there, and then save the models and download them. In order to load those models I then need to provide a dictionary containing all of those custom objects.

What you can do in situations like mine is add some code to the module containing those custom objects:

custom_objects = {}

def custom_object(obj, storage=custom_objects):
    storage[obj.__name__] = obj
    return obj

Then, I can just decorate any class/function that needs to be in the dictionary:

@custom_object
def some_function(x):
    return 3*x*x + 2*x - 2

Moreover, say I want to store my custom loss functions in a different dictionary than my custom Keras layers. Using functools.partial gives me easy access to a new decorator:

import functools
import tensorflow as tf

custom_losses = {}
custom_loss = functools.partial(custom_object, storage=custom_losses)

@custom_loss
def my_loss(y, y_pred):
    return tf.reduce_mean(tf.square(y - y_pred))
simon
3

EDIT (clarification): The mutable default argument issue is a symptom of a deeper design choice, namely, that default argument values are stored as attributes on the function object. You might ask why this choice was made; as always, such questions are difficult to answer properly. But it certainly has good uses:

Optimising for performance:

import math
def foo(sin=math.sin): ...

Grabbing object values in a closure instead of the variable.

callbacks = []
for i in range(10):
    def callback(i=i): ...
    callbacks.append(callback)
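
To make the difference concrete, here is a small sketch comparing a plain closure with the default-argument version. The names `late` and `early` are illustrative only.

```python
late = []
early = []
for i in range(3):
    def late_callback():
        return i        # looks up i when called, after the loop has finished

    def early_callback(i=i):
        return i        # default was evaluated when this def statement ran

    late.append(late_callback)
    early.append(early_callback)

print([f() for f in late])   # [2, 2, 2]
print([f() for f in early])  # [0, 1, 2]
```

This only works because defaults are evaluated once, at definition time. If they were re-evaluated on each call, `early_callback` would behave like `late_callback`.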
Katriel
  • 10
    integers and builtin functions are not mutable! – Reinstate Monica Feb 06 '12 at 10:42
  • 2
    @Jonathan: There's still no mutable default argument in the remaining example, or do I just not see it? – Reinstate Monica Feb 06 '12 at 12:31
  • 2
    @Jonathan: my point is not that these are mutable. It's that the system Python uses to store default arguments -- on the function object, defined at compile-time -- can be useful. This implies the mutable default argument issue, since re-evaluating the argument on each function call will render the trick useless. – Katriel Feb 06 '12 at 13:09
  • 2
    @katriealex: OK, but please say so in your answer that you assume that arguments would have to be re-evaluated, and that you show why that would be bad. Nit-pick: default argument values are not stored at compile-time, but when the function definition statement is executed. – Reinstate Monica Feb 06 '12 at 13:19
  • @WolframH: true :P! Although the two often coincide. – Katriel Feb 06 '12 at 15:01
0

A mutable default argument that is never actually supplied by calling code can be used to create a sentinel value. The standard library's copy.deepcopy does this.

A mutable object is used to ensure that the value is unique to that function: a new list is created when the def statement for deepcopy is executed, and it is otherwise inaccessible, so the object cannot appear anywhere else. Immutable objects tend to get interned, and an empty list is easy to create. Normally, sentinel objects like this would be explicitly created separately, but this way avoids namespace pollution (even with leading-underscore names), I suppose.
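
A minimal sketch of the sentinel idea, loosely modelled on the `_nil=[]` parameter of `copy.deepcopy`. The `lookup` function and `_missing` name are hypothetical and chosen just for illustration.

```python
def lookup(mapping, key, _missing=[]):
    """Hypothetical helper: report whether key is present, even if its value is None."""
    value = mapping.get(key, _missing)
    if value is _missing:        # identity check against the one private list
        return "absent"
    return "present: {!r}".format(value)


data = {"a": None}
print(lookup(data, "a"))  # present: None
print(lookup(data, "b"))  # absent
```

Because the check uses `is` rather than `==`, even a stored empty list would not be confused with the sentinel.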

Karl Knechtel
-2

In answer to the question of good uses for mutable default argument values, I offer the following example:

A mutable default can be useful for programming easy-to-use, importable commands of your own creation. The mutable default method amounts to having private, static variables in a function that you can initialize on the first call (very much like a class), but without having to resort to globals, without having to use a wrapper, and without having to instantiate a class object that was imported. It is in its own way elegant, as I hope you will agree.

Consider these two examples:

def dittle(cache = []):

    from time import sleep # Not needed except as an example.

    # dittle's internal cache list has this format: cache[string, counter]
    # Any argument passed to dittle() that violates this format is invalid.
    # (The string is pure storage, but the counter is used by dittle.)

    # -- Error Trap --
    if type(cache) != list or cache !=[] and (len(cache) == 2 and type(cache[1]) != int):
        print(" User called dittle("+repr(cache)+").\n >> Warning: dittle() takes no arguments, so this call is ignored.\n")
        return

    # -- Initialize Function. (Executes on first call only.) --
    if not cache:
        print("\n cache =",cache)
        print(" Initializing private mutable static cache. Runs only on First Call!")
        cache.append("Hello World!")
        cache.append(0)
        print(" cache =",cache,end="\n\n")
    # -- Normal Operation --
    cache[1]+=1 # Static cycle count.
    outstr = " dittle() called "+str(cache[1])+" times."
    if cache[1] == 1:outstr=outstr.replace("s.",".")
    print(outstr)
    print(" Internal cache held string = '"+cache[0]+"'")
    print()
    if cache[1] == 3:
        print(" Let's rest for a moment.")
        sleep(2.0) # Since we imported it, we might as well use it.
        print(" Wheew! Ready to continue.\n")
        sleep(1.0)
    elif cache[1] == 4:
        cache[0] = "It's Good to be Alive!" # Let's change the private message.

# =================== MAIN ======================        
if __name__ == "__main__":

    for cnt in range(2):dittle() # Calls can be loop-driven, but they need not be.

    print(" Attempting to pass a list to dittle()")
    dittle([" BAD","Data"])
    
    print(" Attempting to pass a non-list to dittle()")
    dittle("hi")
    
    print(" Calling dittle() normally..")
    dittle()
    
    print(" Attempting to set the private mutable value from the outside.")
    # Even an insider's attempt to feed a valid format will be accepted
    # for the one call only, and is then discarded when it goes out
    # of scope. It fails to interrupt normal operation.
    dittle([" I am a Grieffer!\n (Notice this change will not stick!)",-7]) 
    
    print(" Calling dittle() normally once again.")
    dittle()
    dittle()

If you run this code, you will see that the dittle() function initializes on the very first call but not on additional calls, uses a private static cache (the mutable default) for internal static storage between calls, rejects attempts to hijack the static storage, is resilient to malicious input, and can act based on dynamic conditions (here, the number of times the function has been called).

The key to using mutable defaults is never to do anything that rebinds the variable to a new object, but always to change the object in place.
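
A small sketch of that distinction, using two hypothetical counter functions that differ only in whether they mutate the default list or rebind the local name:

```python
def count_in_place(state=[0]):
    state[0] += 1            # mutates the shared default list
    return state[0]


def count_rebound(state=[0]):
    state = [state[0] + 1]   # rebinds the local name; the default is untouched
    return state[0]


in_place = [count_in_place() for _ in range(3)]
rebound = [count_rebound() for _ in range(3)]
print(in_place)  # [1, 2, 3]
print(rebound)   # [1, 1, 1]
```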

To really see the potential power and usefulness of this technique, save this first program to your current directory under the name "DITTLE.py", then run the next program. It imports and uses our new dittle() command without requiring any steps to remember or programming hoops to jump through.

Here is our second example. Save and run this as a new program.

from DITTLE import dittle

print("\n We have emulated a new python command with 'dittle()'.\n")
# Nothing to declare, nothing to instantiate, nothing to remember.

dittle()
dittle()
dittle()
dittle()
dittle()

Now isn't that as slick and clean as can be? These mutable defaults can really come in handy.

========================

After reflecting on my answer for a while, I'm not sure that I made the difference between using the mutable default method and the regular way of accomplishing the same thing clear.

The regular way is to use an importable function that wraps a class object (and uses a global). So for comparison, here is a class-based method that attempts to do the same things as the mutable default method.

from time import sleep

class dittle_class():

    def __init__(self):
        
        self.b = 0
        self.a = " Hello World!"
        
        print("\n Initializing Class Object. Executes on First Call only.")
        print(" self.a = '"+str(self.a),"', self.b =",self.b,end="\n\n")
    
    def report(self):
        self.b  = self.b + 1
        
        if self.b == 1:
            print(" Dittle() called",self.b,"time.")
        else:
            print(" Dittle() called",self.b,"times.")
        
        if self.b == 5:
            self.a = " It's Great to be alive!"
        
        print(" Internal String =",self.a,end="\n\n")
            
        if self.b ==3:
            print(" Let's rest for a moment.")
            sleep(2.0) # Since we imported it, we might as well use it.
            print(" Wheew! Ready to continue.\n")
            sleep(1.0)

cl= dittle_class()

def dittle():
    global cl
    
    if type(cl.a) != str or type(cl.b) != int:
        print(" Class exists but does not have valid format.")
        
    cl.report()

# =================== MAIN ====================== 
if __name__ == "__main__":
    print(" We have emulated a python command with our own 'dittle()' command.\n")
    for cnt in range(2):dittle() # Calls can be loop-driven, but they need not be.
    
    print(" Attempting to pass arguments to dittle()")
    try: # The user must catch the fatal error. The mutable default user did not. 
        dittle(["BAD","Data"])
    except TypeError:
        print(" This caused a fatal error that can't be caught in the function.\n")
    
    print(" Calling dittle() normally..")
    dittle()
    
    print(" Attempting to set the Class variable from the outside.")
    cl.a = " I'm a griefer. My damage sticks."
    cl.b = -7
    
    dittle()
    dittle()

Save this Class-based program in your current directory as DITTLE.py then run the following code (which is the same as earlier.)

from DITTLE import dittle
# Nothing to declare, nothing to instantiate, nothing to remember.

dittle()
dittle()
dittle()
dittle()
dittle()

By comparing the two methods, the advantages of using a mutable default in a function should be clearer. The mutable default method needs no globals, and its internal variables can't be set directly. And while the mutable default method accepted a knowledgeable passed argument for a single cycle and then shrugged it off, the class method was permanently altered because its internal variables are directly exposed to the outside. As for which method is easier to program? I think that depends on your comfort level with the methods and the complexity of your goals.

user10637953
  • I don't know why you need `global` at all in the second example. Despite that, I think the second example is much more readable than the first example. Even if functionally the end result is the same, using `class` signals to the reader, "I have some state I want to keep together". But, you did answer the question, so I give you props for that. I'd actually say this is a good counter-example of why actually using mutable parameters is almost always a bad idea. – DeusXMachina Mar 09 '22 at 21:01