
I create two instances of a Python dataclass that has another dataclass as one of its fields. When I update a value in the nested dataclass of one instance (but not the other), the same change shows up in the nested dataclass of both instances. This is not what I expected.

from dataclasses import dataclass


@dataclass
class sub1:
    q: int = 10
    r: str = "qrst"


@dataclass
class A:
    a: int = 1
    s1: sub1 = sub1()


if __name__ == '__main__':
    a = A()
    aa = A()
    aa.a = 9
    aa.s1.r = "92"
    print("a:", repr(a))
    print("aa:", repr(aa))

''' Produces --
a: A(a=1, s1=sub1(q=10, r='92'))
aa: A(a=9, s1=sub1(q=10, r='92'))
'''

I expected the nested dataclass to be updated in only the specified instance (aa) and for the nested dataclass in the other instance (a) to remain unchanged.

What am I doing wrong, or is dataclass the wrong tool?

jonrsharpe
Den
  • This has nothing to do with dataclasses; this would work the same way if you did `s1: sub1 = []`. This is how default values *always work*. – juanpa.arrivillaga Sep 26 '19 at 23:27
  • Also note, this isn't "nesting". You are simply using an object as an attribute of another object, *composition* would be the jargon. Also, please always use the generic [python] tag for all python related questions. – juanpa.arrivillaga Sep 26 '19 at 23:27
  • This is essentially a duplicate of this: https://stackoverflow.com/questions/1680528/how-to-avoid-having-class-data-shared-among-instances although perhaps the involvement of dataclasses makes it worthy of its own question. The answer/root cause is essentially the same. – juanpa.arrivillaga Sep 26 '19 at 23:33
  • This is not a generic python question as it is python 3 that has dataclasses. – Den Sep 28 '19 at 02:17
  • it *is a Python question* so it should be tagged with the generic tag. Use a version-specific tag at your discretion – juanpa.arrivillaga Sep 28 '19 at 02:19
  • juanpa. I've thought about this a bit more, and you're right. I'll be careful about this in the future. – Den Sep 30 '19 at 01:16

1 Answer


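What you are currently doing is providing a default value for the field. That default is created once, when the class body is executed, and every A instance that is not given its own s1 refers to that same sub1 object. Because the object is mutable, a change made through one instance is visible through all of them.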
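To illustrate the sharing, here is a minimal sketch that uses plain classes instead of dataclasses. The names `Sub` and `Holder` are hypothetical stand-ins for `sub1` and `A`. Plain classes are used on purpose: as far as I know, Python 3.11 and later reject an unhashable default such as `sub1()` in a dataclass with a `ValueError`, so the question's code only runs on older versions.

```python
class Sub:
    """Hypothetical stand-in for the question's sub1: a plain mutable object."""

    def __init__(self):
        self.r = "qrst"


class Holder:
    # This expression is evaluated once, when the class body runs,
    # and the resulting object is stored on the class itself.
    s1 = Sub()


a = Holder()
aa = Holder()
aa.s1.r = "92"  # mutates the single shared Sub object

print(a.s1 is aa.s1)      # True: both names reach the same object
print(a.s1 is Holder.s1)  # True: it lives on the class, not the instance
print(a.s1.r)             # 92
```

Neither instance ever receives its own `Sub`; attribute lookup falls through to the class attribute in both cases.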

What you should do instead is provide a default factory that produces sub1 instances for each new A instance:

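from dataclasses import dataclass, field


@dataclass
class sub1:  # same definition as in the question
    q: int = 10
    r: str = "qrst"


@dataclass
class A:
    a: int = 1
    # default_factory is called once per new A, so each instance gets its own sub1
    s1: sub1 = field(default_factory=sub1)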

a = A()
aa = A()
aa.a = 9
aa.s1.r = "92"
print("a:", repr(a))  # a: A(a=1, s1=sub1(q=10, r='qrst'))
print("aa:", repr(aa))  # aa: A(a=9, s1=sub1(q=10, r='92'))
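
As a quick check that the factory really produces separate objects, the sketch below compares identities. It also shows a `lambda` factory for the case where the nested object needs non-default arguments. The class `B` and the field `s2` are hypothetical additions for illustration and are not part of the question.

```python
from dataclasses import dataclass, field


@dataclass
class sub1:
    q: int = 10
    r: str = "qrst"


@dataclass
class B:  # hypothetical variant of A with one extra field
    a: int = 1
    s1: sub1 = field(default_factory=sub1)
    # Any zero-argument callable works as a factory, including a lambda.
    s2: sub1 = field(default_factory=lambda: sub1(q=5))


x = B()
y = B()
y.s1.r = "92"

print(x.s1 is y.s1)  # False: each instance has its own sub1
print(x)             # B(a=1, s1=sub1(q=10, r='qrst'), s2=sub1(q=5, r='qrst'))
```

The factory is only called when no value is passed for the field, so `B(s1=some_sub1)` still uses the object you supply.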
Patrick Haugh