I'm trying to get familiar with dataclasses in Python. One thing I learned from reading online is that turning a regular class definition with a mutable class variable (which is a bad thing) into a dataclass prevents that mistake. For example:

regular class:

class A:
    a = []
    
    def __init__(self):
        self.b = 1

This can be a problem because different instances share the same class variable a and can modify it unknowingly.
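A minimal sketch of that sharing, using the same class as above (the variable names x and y are only for illustration):

```python
class A:
    a = []  # class attribute: one list object shared by every instance

    def __init__(self):
        self.b = 1


x = A()
y = A()
x.a.append("from x")  # mutates the shared list through x

print(y.a)         # the element appended through x is visible through y
print(x.a is y.a)  # both lookups resolve to the same object, A.a
```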

and with dataclass:

from dataclasses import dataclass

@dataclass
class A:
    a: list = []  # annotated, so this is a dataclass field with a mutable default

    def __init__(self):
        self.b = 1

Python does not let me define this class; it raises:

ValueError: mutable default <class 'list'> for field a is not allowed: use default_factory

However, if I simply remove the type annotation:

@dataclass
class A:
    a = []

    def __init__(self):
        self.b = 1

there is no complaint at all, and a is still shared across instances.
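A small sketch of what I observe, assuming the standard dataclasses.fields() helper is a fair way to list what the decorator picked up (the class name Unannotated is made up for this example, and I left out the custom __init__ to keep it short):

```python
from dataclasses import dataclass, fields


@dataclass
class Unannotated:
    a = []  # no annotation, so the decorator leaves this as a plain class attribute


first = Unannotated()
second = Unannotated()
first.a.append(1)

print(fields(Unannotated))  # () because no fields were discovered
print(second.a)             # [1], the list is shared between instances
```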

Is this expected?

How come a simple type annotation changes the behavior of the class variable?

(I'm using Python 3.7.6)

Axe319
Sam
  • Because it does. The type annotations are what trigger this check. What you wrote is not inherently wrong, it's just inadvisable. Python can't diagnose inadvisable behavior unless you are doing type checking. – Tim Roberts Nov 30 '21 at 19:00
  • What do you mean how? The `dataclasses.dataclass` decorator uses the *type annotations* of the class to generate the code for the class. IIRC an unannotated variable is just assumed to be a class variable, not an instance variable (which it will be turned into if you annotate it) – juanpa.arrivillaga Nov 30 '21 at 19:03
  • And why are you implementing `__init__`? Note, the decorator assumes your class is empty, it recognizes no instance variables, so things like `__repr__` won't handle `self.b` – juanpa.arrivillaga Nov 30 '21 at 19:05
  • Also note, having mutable class variables is pretty normal. It is *mutable default arguments* which are problematic, since they behave unexpectedly, although if you understand how they work it isn't a huge problem – juanpa.arrivillaga Nov 30 '21 at 19:06
  • See https://docs.python.org/3/library/dataclasses.html#mutable-default-values – Axe319 Nov 30 '21 at 19:15
  • Oh I see. As reddy showed in the answer, it looks for fields in `__annotations__`; anything not found there is just treated as a class variable. It does seem that one always needs to remember to add the type annotation where it's needed, a habit I don't always have yet... – Sam Nov 30 '21 at 19:43
  • @Sam *the whole point of dataclasses is to use annotations to avoid boilerplate*. That's the *first thing* you'd remember if you are using `dataclasses.dataclass`. I'm not sure what you read, but you shouldn't use this decorator for things like "preventing a mutable class variable", it is a code-generator to avoid boilerplate for classes that act like "records", i.e. bundles of data ... a *data class*. Again, implementing `__init__` defeats the entire purpose – juanpa.arrivillaga Nov 30 '21 at 19:50
  • I advise you to check `__post_init__` for initialisation inside a dataclass: https://docs.python.org/3/library/dataclasses.html#post-init-processing – Mehmet Burak Sayıcı Jun 14 '23 at 12:11

1 Answer

When you declare

@dataclass
class A:
    a = []

    def __init__(self):
        self.b = 1

a is not a dataclass field. REF: https://github.com/ericvsmith/dataclasses/issues/2#issuecomment-302987864

You can take a look at the __dataclass_fields__ and __annotations__ attributes after declaring the class.

In [55]: @dataclass
    ...: class A:
    ...:     a: list = field(default_factory=list)
    ...:
    ...:     def __init__(self):
    ...:         self.b = 1
    ...:

In [56]: A.__dict__
Out[56]:
mappingproxy({'__module__': '__main__',
              '__annotations__': {'a': list},
              '__init__': <function __main__.A.__init__(self)>,
              '__dict__': <attribute '__dict__' of 'A' objects>,
              '__weakref__': <attribute '__weakref__' of 'A' objects>,
              '__doc__': 'A()',
              '__dataclass_params__': _DataclassParams(init=True,repr=True,eq=True,order=False,unsafe_hash=False,frozen=False),
              '__dataclass_fields__': {'a': Field(name='a',type=<class 'list'>,default=<dataclasses._MISSING_TYPE object at 0x7f8a27ada250>,default_factory=<class 'list'>,init=True,repr=True,hash=None,compare=True,metadata=mappingproxy({}),_field_type=_FIELD)},
              '__repr__': <function dataclasses.__repr__(self)>,
              '__eq__': <function dataclasses.__eq__(self, other)>,
              '__hash__': None})

In [57]: @dataclass
    ...: class A:
    ...:     a = []
    ...:
    ...:     def __init__(self):
    ...:         self.b = 1
    ...:

In [58]: A.__dict__
Out[58]:
mappingproxy({'__module__': '__main__',
              'a': [],
              '__init__': <function __main__.A.__init__(self)>,
              '__dict__': <attribute '__dict__' of 'A' objects>,
              '__weakref__': <attribute '__weakref__' of 'A' objects>,
              '__doc__': 'A()',
              '__dataclass_params__': _DataclassParams(init=True,repr=True,eq=True,order=False,unsafe_hash=False,frozen=False),
              '__dataclass_fields__': {},
              '__repr__': <function dataclasses.__repr__(self)>,
              '__eq__': <function dataclasses.__eq__(self, other)>,
              '__hash__': None})

From PEP 557:

"The dataclass decorator examines the class to find fields. A field is defined as any variable identified in __annotations__. That is, a variable that has a type annotation."

REF: How to add a dataclass field without annotating the type?
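As a rough illustration of that rule, the sketch below mixes annotated and unannotated attributes and compares __annotations__ with what dataclasses.fields() reports. The class name Mixed and its attribute names are hypothetical. The ClassVar line is an extra case not shown in the question: it is annotated, but dataclasses skips ClassVar annotations when collecting fields.

```python
from dataclasses import dataclass, field, fields
from typing import ClassVar, List


@dataclass
class Mixed:
    x: int = 0                                     # annotated: a field
    tags: List[str] = field(default_factory=list)  # annotated: a field
    shared = []                                    # unannotated: plain class attribute
    registry: ClassVar[list] = []                  # ClassVar: annotated, but not a field


print(list(Mixed.__annotations__))      # ['x', 'tags', 'registry']
print([f.name for f in fields(Mixed)])  # ['x', 'tags']
```

Only x and tags become parameters of the generated __init__; shared and registry stay on the class and are shared by all instances.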

Checks only happen on dataclass fields, not on class variables. Here is the check on a field that raises the error:

  if f._field_type is _FIELD and isinstance(f.default, (list, dict, set)):
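To see the check fire, here is a minimal sketch that catches the error at class-definition time (the class name Bad and the variable error_message are made up for this example). Newer Python versions (3.11 onward, if I read the docs correctly) reject any unhashable default rather than only list, dict and set, so the exact condition may differ from the line quoted above.

```python
from dataclasses import dataclass

error_message = None

try:
    @dataclass
    class Bad:
        a: list = []  # annotated field with a mutable default
except ValueError as exc:
    # The decorator raises while the class is being created,
    # before any instance exists.
    error_message = str(exc)

print(error_message)
```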

Why mutable types are not allowed: https://docs.python.org/3/library/dataclasses.html#mutable-default-values
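For completeness, one possible dataclass version of the class from the question, as a sketch rather than the only way to do it: default_factory gives each instance its own list, and b is declared as a field with init=False so no hand-written __init__ is needed.

```python
from dataclasses import dataclass, field


@dataclass
class A:
    a: list = field(default_factory=list)  # list() is called for each new instance
    b: int = field(default=1, init=False)  # starts at 1, not an __init__ parameter


p = A()
q = A()
p.a.append("only in p")

print(p.a)  # ['only in p']
print(q.a)  # []
print(q.b)  # 1
```

Because both attributes are now fields, the generated __repr__ and __eq__ include them as well.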

reddy nishanth