181

This seems likely to have been asked before, but an hour or so of searching has yielded no results. The question "Passing default list argument to dataclasses" looked promising, but it's not quite what I'm looking for.

Here's the problem: when one tries to use a mutable value as the default for a dataclass field, there's an error:

from dataclasses import dataclass

@dataclass
class Foo:
    bar: list = []

# ValueError: mutable default <class 'list'> for field bar is not allowed: use default_factory

I gathered from the error message that I'm supposed to use the following instead:

from dataclasses import dataclass, field

@dataclass
class Foo:
    bar: list = field(default_factory=list)

But why are mutable defaults not allowed? Is it to enforce avoidance of the mutable default argument problem?

colllin
Graham
  • 14
    "Is it to enforce avoidance of the mutable default argument problem" Yes. Imagine a change to one instance changing all of instances ever created. If this is one's desired behavior they should use a class attribute. – DeepSpace Dec 05 '18 at 12:21
  • 2
    [Relevant section of PEP 557](https://www.python.org/dev/peps/pep-0557/#mutable-default-values) explaining this design. – shmee Dec 05 '18 at 13:08
  • 4
    Your question answered my question, clearly you are smarter than me. Take this upvote! – Mr. Developerdude Dec 11 '19 at 10:27
  • I think this https://youtrack.jetbrains.com/issue/PY-42319 – Tyomik_mnemonic Feb 11 '21 at 09:04
  • 12
    As I've still managed to miss this solution in the question, I'll copy the proper syntax here: `bar: list = dataclasses.field(default_factory=list)` – Nickolay Jun 19 '21 at 13:28

5 Answers

137

It looks like my question was quite clearly answered in the docs (which derived from PEP 557, as shmee mentioned):

Python stores default member variable values in class attributes. Consider this example, not using dataclasses:

class C:
    x = []
    def add(self, element):
        self.x.append(element)

o1 = C()
o2 = C()
o1.add(1)
o2.add(2)
assert o1.x == [1, 2]
assert o1.x is o2.x

Note that the two instances of class C share the same class variable x, as expected.

Using dataclasses, if this code was valid:

@dataclass
class D:
    x: list = []      # This code raises ValueError
    def add(self, element):
        self.x.append(element)

it would generate code similar to:

class D:
    x = []
    def __init__(self, x=x):
        self.x = x
    def add(self, element):
        self.x.append(element)

This has the same issue as the original example using class C. That is, two instances of class D that do not specify a value for x when creating a class instance will share the same copy of x. Because dataclasses just use normal Python class creation they also share this behavior. There is no general way for Data Classes to detect this condition. Instead, dataclasses will raise a ValueError if it detects a default parameter of type list, dict, or set. This is a partial solution, but it does protect against many common errors.
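To make the quoted explanation concrete, here is a minimal sketch showing both halves of it: the list default is rejected when the class is defined, and a default_factory gives each instance its own list. The class names Broken and Fixed are made up for illustration.

```python
from dataclasses import dataclass, field

# A list default is rejected while the class is being created,
# before any instance exists.
try:
    @dataclass
    class Broken:
        x: list = []
except ValueError as exc:
    error_message = str(exc)
else:
    error_message = None


@dataclass
class Fixed:
    # The factory is called once per instance that omits x.
    x: list = field(default_factory=list)

    def add(self, element):
        self.x.append(element)


o1 = Fixed()
o2 = Fixed()
o1.add(1)
o2.add(2)

# Unlike class C above, the two instances do not share a list.
assert o1.x == [1]
assert o2.x == [2]
assert o1.x is not o2.x
```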

Renato Byrro
Graham
22

The above answer is not correct. A mutable default value, such as an empty list, can be defined in a dataclass by using default_factory.

    @dataclass
    class D:
        x: list = field(default_factory=list) 

Using default factory functions is a way to create new instances of mutable types as default values for fields:

   @dataclass
   class D:
       x: list = field(default_factory=list)

   assert D().x is not D().x

The link is here
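As a rough mental model only, the effect of default_factory resembles the hand-written class below. This is an approximation, not the code that dataclasses actually generates; HandWrittenD and the _MISSING sentinel are hypothetical names used for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class D:
    x: list = field(default_factory=list)


# Hypothetical sentinel meaning "the caller did not pass x".
_MISSING = object()


class HandWrittenD:
    """Hand-written approximation of D's constructor behavior."""

    def __init__(self, x=_MISSING):
        # Build a fresh list per call instead of reusing one default object.
        self.x = list() if x is _MISSING else x


a, b = HandWrittenD(), HandWrittenD()
shared = [1, 2]
c = HandWrittenD(shared)

assert a.x is not b.x        # each default is a new list
assert c.x is shared         # an explicit argument is used as-is
assert D().x is not D().x    # the dataclass behaves the same way
assert D(shared).x is shared
```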

Shizzy
  • 1
    There is nothing wrong with the above answer from what I can tell, though I will agree that it appears to be somewhat incomplete, as mentioned. The important part is highlighted here: *"Using dataclasses, **if** this code was valid [...]"* – rv.kvetch Nov 02 '22 at 17:03
  • @rv.kvetch I cannot fully agree, because if you check the *assert* conditions in these two sections, it's clear that using *default_factory* will not give you the problem it's trying to avoid: `@dataclass class D: x: list = field(default_factory=list)` followed by `assert D().x is not D().x` – Shizzy Nov 04 '22 at 09:37
  • 7
    You should write "@username’s answer" rather than "The above answer" because the order in which answers are displayed can change over time. Right now your answer is just below one that was written three weeks after yours. – bfontaine Jan 17 '23 at 16:53
8

Just use a callable in your default_factory:

from dataclasses import dataclass, field

@dataclass
class SomeClass:
    """
    """

    some_list: list = field(default_factory=lambda: ["your_values"])

If you want all instances to mutate the same list:

from dataclasses import dataclass, field

SHARED_LIST = ["your_values"]
    
@dataclass
class SomeClass:
    """
    """
    
    some_list: list = field(default_factory=lambda: SHARED_LIST)
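
A small sketch of the difference between the two variants above, assuming the same SHARED_LIST module-level name; the class names Fresh and Shared are made up for this comparison.

```python
from dataclasses import dataclass, field

SHARED_LIST = ["your_values"]


@dataclass
class Fresh:
    # The lambda builds a new list on every call.
    some_list: list = field(default_factory=lambda: ["your_values"])


@dataclass
class Shared:
    # The lambda returns the same module-level list every time.
    some_list: list = field(default_factory=lambda: SHARED_LIST)


f1, f2 = Fresh(), Fresh()
f1.some_list.append("extra")
assert f2.some_list == ["your_values"]  # f2 is unaffected

s1, s2 = Shared(), Shared()
s1.some_list.append("extra")
assert s2.some_list == ["your_values", "extra"]  # change is visible everywhere
assert s1.some_list is SHARED_LIST
```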
Metalstorm
6

Import field alongside dataclass:

from dataclasses import dataclass, field

and use this for lists:

@dataclass
class Foo:
    bar: list = field(default_factory=list)
Sadegh Pouriyan
  • 2
    The question is not about what to write instead. The question is about why it is not permitted to do things in the obvious way. – Karl Knechtel Jan 04 '23 at 00:18
1

I stumbled across this issue because I do want to have a static list as a class variable. This can be done using the ClassVar annotation:

from dataclasses import dataclass
from typing import ClassVar

@dataclass
class Foo:
    bar: ClassVar[list[str]] = ['hello', 'world']
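
For illustration, a short sketch of what the ClassVar annotation changes: the attribute is not treated as a dataclass field, so it stays a plain class attribute shared by all instances. The extra name field is hypothetical and only there for contrast, and list[str] assumes Python 3.9 or later.

```python
from dataclasses import dataclass, fields
from typing import ClassVar


@dataclass
class Foo:
    # Shared class attribute, ignored by the dataclass machinery.
    bar: ClassVar[list[str]] = ["hello", "world"]
    # Hypothetical regular field, added here only for comparison.
    name: str = "example"


# Only the regular field is reported as a dataclass field.
field_names = [f.name for f in fields(Foo)]
assert field_names == ["name"]

a, b = Foo(), Foo()
a.bar.append("!")

# Both instances see the same list, which lives on the class.
assert b.bar == ["hello", "world", "!"]
assert a.bar is Foo.bar
```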
NicoHood