136

I'm trying to create a frozen dataclass but I'm having issues with setting a value from __post_init__. Is there a way to set a field value based on values from an init param in a dataclass when using the frozen=True setting?

RANKS = '2,3,4,5,6,7,8,9,10,J,Q,K,A'.split(',')
SUITS = 'H,D,C,S'.split(',')


@dataclass(order=True, frozen=True)
class Card:
    rank: str = field(compare=False)
    suit: str = field(compare=False)
    value: int = field(init=False)
    def __post_init__(self):
        self.value = RANKS.index(self.rank) + 1
    def __add__(self, other):
        if isinstance(other, Card):
            return self.value + other.value
        return self.value + other
    def __str__(self):
        return f'{self.rank} of {self.suit}'

And this is the traceback:

 File "C:/Users/user/.PyCharm2018.3/config/scratches/scratch_5.py", line 17, in __post_init__
    self.value = RANKS.index(self.rank) + 1
  File "<string>", line 3, in __setattr__
dataclasses.FrozenInstanceError: cannot assign to field 'value'
nicholishen

7 Answers

137

Use the same thing the generated __init__ method does: object.__setattr__.

def __post_init__(self):
    object.__setattr__(self, 'value', RANKS.index(self.rank) + 1)
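
For completeness, here is a minimal self-contained sketch of the question's class with this fix applied. It reuses `RANKS` and the field layout from the question; the `try`/`except` at the end is only there to illustrate that ordinary assignment remains blocked.

```python
from dataclasses import dataclass, field, FrozenInstanceError

RANKS = '2,3,4,5,6,7,8,9,10,J,Q,K,A'.split(',')


@dataclass(order=True, frozen=True)
class Card:
    rank: str = field(compare=False)
    suit: str = field(compare=False)
    value: int = field(init=False)

    def __post_init__(self):
        # Call object's implementation directly, bypassing the frozen
        # class's __setattr__ override (the generated __init__ does the same).
        object.__setattr__(self, 'value', RANKS.index(self.rank) + 1)


card = Card('K', 'H')
print(card.value)  # 12

try:
    card.value = 1  # ordinary assignment is still rejected
except FrozenInstanceError as exc:
    print(exc)
```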
user2357112
  • 68
    This works. However, it does seem that the `dataclass` generated `__setattr__` should know to not raise `FrozenInstanceError` when being called from `__post_init__` on a name that has `init=False`. Using `object.__setattr__` like this is ugly / tedious. – John B Jan 20 '19 at 23:25
  • 19
    `super().__setattr__('attr_name', value)` seems cleaner to me. And it should work as long as the dataclass does not inherit from another frozen dataclass – Conchylicultor Dec 13 '19 at 02:22
  • 5
    @Conchylicultor I agree, but this answer simply follows the [documentation](https://docs.python.org/3/library/dataclasses.html#frozen-instances): `There is a tiny performance penalty when using frozen=True: __init__() cannot use simple assignment to initialize fields, and must use object.__setattr__().` – xuhdev Nov 12 '20 at 22:54
  • 1
    As a person with a Java background I'm totally confused. Isn't `object` the base class that all classes inherit from (extend)? Then what exactly does it mean to call `object.__setattr__`? Can I use `object.__dict__['attr_name'] = value` then? – Alireza Mohamadi Jan 04 '22 at 10:38
  • 2
    @AlirezaMohamadi: Unlike Java, Python lets you explicitly call method implementations from specific classes. Here, we call the `__setattr__` implementation from `object`, bypassing the override in the frozen dataclass. – user2357112 Jan 04 '22 at 10:46
  • 2
    `object.__dict__['attr_name'] = value` is wrong because it would try to set an entry in `object`'s `__dict__` instead of `self`. `self.__dict__['attr_name'] = value` would work for most attributes, but it would fail with anything that needs to go through a [descriptor](https://docs.python.org/3/reference/datamodel.html#implementing-descriptors), such as attributes that use `__slots__` (if `dataclasses` ever adds `__slots__` support). – user2357112 Jan 04 '22 at 10:47
  • 2
    > (if dataclasses ever adds __slots__ support). @user2357112supportsMonica slots support was added in Python 3.10 some months ago : https://docs.python.org/3/whatsnew/3.10.html#dataclasses (along with the kw_only arg that makes dataclass inheritance usable at last) – florianlh Feb 03 '22 at 10:15
  • 3
    @xuhdev "this answer simply follows the documentation": not really, if we want to be precise. Rather, this answer follows what the generated `__init__` does (which is mentioned in the documentation). It's still valuable to note that the docs don't suggest a solution for the OP's question. – Carl Oct 16 '22 at 11:21
12

A solution I use in almost all of my classes is to define additional constructors as classmethods.

Based on the given example, one could rewrite it as follows:

@dataclass(order=True, frozen=True)
class Card:
    rank: str = field(compare=False)
    suit: str = field(compare=False)
    value: int

    def __post_init__(self) -> None:
        if not is_valid_rank(self.rank):
            raise ValueError(f"Rank {self.rank} of Card is invalid!")

    @classmethod
    def from_rank_and_suite(cls, rank: str, suit: str) -> "Card":
        value = RANKS.index(rank) + 1  # a classmethod has no self; use the rank argument
        return cls(rank=rank, suit=suit, value=value)

This way, one has all the freedom one needs without resorting to __setattr__ hacks and without giving up desired strictness like frozen=True.
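
A runnable sketch of this pattern might look as follows. Two things here are assumptions, not part of the answer: `is_valid_rank` is not defined in the snippet above, so a plain membership test against `RANKS` stands in for it, and the usage lines at the bottom are purely illustrative.

```python
from dataclasses import dataclass, field

RANKS = '2,3,4,5,6,7,8,9,10,J,Q,K,A'.split(',')


@dataclass(order=True, frozen=True)
class Card:
    rank: str = field(compare=False)
    suit: str = field(compare=False)
    value: int

    def __post_init__(self) -> None:
        # Assumption: membership in RANKS replaces the undefined is_valid_rank helper.
        if self.rank not in RANKS:
            raise ValueError(f"Rank {self.rank} of Card is invalid!")

    @classmethod
    def from_rank_and_suite(cls, rank: str, suit: str) -> "Card":
        # Derive value before construction, so nothing is assigned after __init__.
        return cls(rank=rank, suit=suit, value=RANKS.index(rank) + 1)


ace = Card.from_rank_and_suite('A', 'S')
print(ace)  # Card(rank='A', suit='S', value=13)
```

Direct construction such as `Card('A', 'S', 99)` is still possible with this design, so any invariant tying `value` to `rank` would have to be checked in `__post_init__` as well.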

Max Görner
  • 3
    I like this approach! But one advantage of using `__post_init__` is to guarantee invariants. With the `@classmethod`/`@staticmethod` approach, people using the dataclass can still construct it directly. – jli May 21 '22 at 23:05
  • 3
    You are right, important invariants are not enforced. However, if they are important, one still could implement a `__post_init__` to ensure these invariants. – Max Görner May 22 '22 at 14:48
  • I combined both. Though right now it occurred to me to write a frozen dataclass with an `__init__` clearer and more flexible than the default one, and now I need the hack again. :( – arseniiv Nov 17 '22 at 19:05
  • 1
    New to Python here. Any reason you use @classmethod over @staticmethod? – jmrah Mar 10 '23 at 22:40
  • 1
    This is a very good question. I must admit that I have no hard reasons to prefer one over the other. As a rule of thumb, I try to avoid `@staticmethod` because such methods can be plain functions as well. Here I use `@classmethod` because it gives me access to `cls`, so I do not need to duplicate the class's name in the function body. In the end it's a question of style and taste; doing it differently would work just as well. – Max Görner Mar 12 '23 at 09:22
6

Using mutation

Frozen objects should not be changed. But once in a while the need may arise. The accepted answer works perfectly for that. Here is another way of approaching this: return a new instance with the changed values. This may be overkill for some cases, but it's an option.

from copy import deepcopy
from dataclasses import dataclass

@dataclass(frozen=True)
class A:
    a: str = ''
    b: int = 0

    def mutate(self, **options):
        new_config = deepcopy(self.__dict__)
        # some validation here
        new_config.update(options)
        return self.__class__(**new_config)

Another approach

If you want to set all or many of the values, you can call __init__ again inside __post_init__. Though there are not many use cases.

The following example is not practical, only for demonstrating the possibility.

from dataclasses import dataclass, InitVar


@dataclass(frozen=True)
class A:
    a: str = ''
    b: int = 0
    config: InitVar[dict] = None

    def __post_init__(self, config: dict):
        if config:
            self.__init__(**config)

The following call

A(config={'a':'a', 'b':1})

will yield

A(a='a', b=1)

without raising an error. This was tested on Python 3.7 and 3.9.

Of course, you can construct the object directly using A(a='hi', b=1), but there may be other uses, e.g. loading configs from a JSON file.

Bonus: an even crazier usage

A(config={'a':'a', 'b':1, 'config':{'a':'b'}})

will yield

A(a='b', b=0)

because the innermost __init__ call receives only a, so b is reset to its default.
Tim
  • 1
    This solution gives us an interesting level of flexibility! The only problem is that once this implementation is merged into our code base, we are also handing "tools and approaches" to the "bad practice engineers". They now have an example in the code base of something that breaks what I like to call `semantic consistency`, and will use it to defend their smelly implementations. "Semantic consistency" is always a good way of empowering good practices, but we are not achieving it when calling `__init__` inside `__post_init__` :'( – gbrennon Dec 10 '21 at 05:21
  • 1
    @gbrennon great observation! I completely agree. I've added another approach :) – Tim Dec 10 '21 at 21:01
  • 6
    [dataclasses.replace](https://docs.python.org/3/library/dataclasses.html#dataclasses.replace) is intended for this purpose. – Hymns For Disco Dec 23 '21 at 00:29
  • @HymnsForDisco nice! But it seems to require specifying init-only variables without default values. – Tim Dec 23 '21 at 01:26
5

Solution avoiding object mutation using cached property

This is a simplified version of @Anna Giasson's answer.

Frozen dataclasses work well together with caching from the functools module. Instead of using a dataclass field, you can define a @functools.cached_property annotated method that gets evaluated only upon the first lookup of the attribute. Here is a minimal version of the original example:

from dataclasses import dataclass
import functools

@dataclass(frozen=True)
class Card:
    rank: str

    @functools.cached_property
    def value(self):
        # just for demonstration:
        # this gets printed only once per Card instance
        print("Evaluate value")
        return len(self.rank)

card = Card(rank="foo")

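assert card.value == 3  # len("foo"); "Evaluate value" is printed on this first access
assert card.value == 3  # served from the cache; nothing is printed this time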

In practice, if the evaluation is cheap, you can also use a non-cached @property decorator.
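
This works with frozen=True because `functools.cached_property` stores its result directly in the instance's `__dict__` rather than going through `__setattr__`. A small sketch applying it to the question's `Card` (the `RANKS` list is copied from the question; the checks at the end are only illustrative):

```python
from dataclasses import dataclass, FrozenInstanceError
from functools import cached_property  # available since Python 3.8

RANKS = '2,3,4,5,6,7,8,9,10,J,Q,K,A'.split(',')


@dataclass(frozen=True)
class Card:
    rank: str
    suit: str

    @cached_property
    def value(self) -> int:
        # Computed on first access, then read from the instance __dict__.
        return RANKS.index(self.rank) + 1


card = Card('Q', 'D')
print(card.value)             # 11
print('value' in vars(card))  # True: the result is now cached on the instance

try:
    card.rank = 'K'  # regular field assignment is still rejected
except FrozenInstanceError:
    print("still frozen")
```

Since `value` is not a dataclass field here, it takes no part in the generated `__eq__`, `__hash__` or `__repr__`.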

Peter Barmettler
  • 1
    This seems like a way better aligning with immutable semantics, while also not introducing low-level irregular python hacks or idioms. – matanster Jun 29 '23 at 14:23
1

This feels a little bit like 'hacking' the intent of a frozen dataclass, but works well and is clean for making modifications to a frozen dataclass within the post_init method. Note that this decorator could be used for any method (which feels scary, given that you expect the dataclass to be frozen), thus I compensated by asserting the function name this decorator attaches to must be 'post_init'.

Separate from the class, write a decorator that you'll use in the class:

def _defrost(cls):
    cls.stash_setattr = cls.__setattr__
    cls.stash_delattr = cls.__delattr__
    cls.__setattr__ = object.__setattr__
    cls.__delattr__ = object.__delattr__

def _refreeze(cls):
    cls.__setattr__ = cls.stash_setattr
    cls.__delattr__ = cls.stash_delattr
    del cls.stash_setattr
    del cls.stash_delattr

def temp_unfreeze_for_postinit(func):
    assert func.__name__ == '__post_init__'
    def wrapper(self, *args, **kwargs):
        _defrost(self.__class__)
        try:
            func(self, *args, **kwargs)
        finally:
            # Refreeze even if __post_init__ raises, so the class is never left unfrozen
            _refreeze(self.__class__)
    return wrapper

Then, within your frozen dataclass, simply decorate your post_init method!

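import dataclasses

@dataclasses.dataclass(frozen=True)
class SimpleClass:
    a: int
    # Init-only argument: the generated __init__ passes it on to __post_init__
    adder: dataclasses.InitVar[int]

    @temp_unfreeze_for_postinit
    def __post_init__(self, adder):
        self.b = self.a + adder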
bayesIan
  • 2
    This is a bad idea - doing this means it's unsafe to construct instances of your class from different threads at the same time. – user2357112 Jul 19 '22 at 08:14
1

Commenting with my own solution as I stumbled upon this with the same question but found none of the solutions suited my application.

Here, the property I initially tried to create in a post_init method, much like the OP, is bit_mask.

I got it to work with the cached_property decorator from functools, since I wanted the property to be static/immutable, much like the other properties in the dataclass.

The function create_bitmask is defined elsewhere in my code, but you can see that it depends on the other properties of the dataclass instance.

Hopefully, someone else might find this helpful.

from dataclasses import dataclass
from functools import cached_property
from typing import Callable

@dataclass(frozen=True)
class Register:
    subsection: str
    name: str
    abbreviation: str
    address: int
    n_bits: int
    _get_method: Callable[[int], int]
    _set_method: Callable[[int, int], None]
    _save_method: Callable[[int, int], None]

    @cached_property
    def bit_mask(self) -> int:
        # cached_property computes this once per instance and stores the result,
        # which avoids recalculating a value that never changes
        return create_bitmask(
            n_bits=self.n_bits,
            start_bit=0,
            size=self.n_bits,
            set_val=True
            )

    def get(self) -> int:
        raw_value = self._get_method(self.address)
        return raw_value & self.bit_mask

    def set(self, value: int) -> None:
        self._set_method(
            self.address,
            value & self.bit_mask
            )

    def save(self, value: int) -> None:
        self._save_method(
            self.address,
            value & self.bit_mask
            )
  • 3
    what is the benefit to using @property with `@lru_cache(maxsize=1)`, i.e. over [`cached_property`](https://docs.python.org/3/library/functools.html#functools.cached_property)? – rv.kvetch Sep 21 '22 at 21:03
  • 2
    I suppose there is none, I simply didn't know of the existence of cached_property. Thanks for pointing me toward that. That would definitely simplify this more, plus that appears to be the intended use case. – Anna Giasson Oct 31 '22 at 17:37
0

Avoiding mutation as proposed by Peter Barmettler is what I tend to do in such cases. It feels much more consistent with the frozen=True feature. As a side note, order=True and the __add__ method made me think you would like to sort and compute a score based on a list of cards.

This might be a possible approach:

from __future__ import annotations
from dataclasses import dataclass

RANKS = '2,3,4,5,6,7,8,9,10,J,Q,K,A'.split(',')
SUITS = 'H,D,C,S'.split(',')


@dataclass(frozen=True)
class Card:
    rank: str
    suit: str

    @property
    def value(self) -> int:
        return RANKS.index(self.rank) + 1

    def __lt__(self, __o: Card) -> bool:
        return self.value < __o.value

    def __str__(self) -> str:
        return f'{self.rank} of {self.suit}'

    @classmethod
    def score(cls, cards: list[Card]) -> int: 
        return sum(card.value for card in cards)


c1 = Card('A', 'H')
c2 = Card('3', 'D')

cards = [c1, c2]

Card.score(cards) # -> 15
sorted(cards) # -> [Card(rank='3', suit='D'), Card(rank='A', suit='H')]

The scoring logic does not need to be a class method, but this feels ok since the logic determining the value of a card is inside the class as well.

raffaele