35

I'm trying to build a @dataclass that defines a schema but is not actually instantiated with the given members. (Basically, I'm hijacking the convenient @dataclass syntax for other purposes). This almost does what I want:

@dataclass(frozen=True, init=False)
class Tricky:
    thing1: int
    thing2: str

    def __init__(self, thing3):
        self.thing3 = thing3

But I get a FrozenInstanceError in the __init__ method:

dataclasses.FrozenInstanceError: cannot assign to field 'thing3'

I need the frozen=True (for hashability). Is there some way I can set a custom attribute in __init__ on a frozen @dataclass?

Sasgorilla
  • 1
    "(Basically, I'm hijacking the convenient @dataclass syntax for other purposes)" Um, just don't do that? Or just don't use frozen and implement your own `__hash__`, seeing as you aren't really using a dataclass... – juanpa.arrivillaga Sep 11 '19 at 17:13
  • Where does `self.thing3` come from? – vb_rises Sep 11 '19 at 17:14
  • 1
    What *are* you using the syntax for then? Because the `@dataclass` syntax is not even dataclass-specific, it is just using standard annotations and type hinting. What problem are you solving by adopting dataclasses? – Martijn Pieters Sep 11 '19 at 17:23
  • @juanpa.arrivillaga: or just use `unsafe_hash=True` instead of `frozen=True`. – Martijn Pieters Sep 11 '19 at 17:33

5 Answers

24

The problem is that the default __init__ of a frozen dataclass assigns its fields through object.__setattr__(), which bypasses the frozen __setattr__. If you provide your own __init__, you have to do the same, which makes your code pretty hacky:

from dataclasses import dataclass

@dataclass(frozen=True, init=False)
class Tricky:
    thing1: int
    thing2: str

    def __init__(self, thing3):
        object.__setattr__(self, "thing3", thing3)
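
For illustration, here is a minimal sketch of how that workaround behaves. The class name TrickyDemo and the sample values are assumptions made up for this example, not part of the question. It shows that thing3 is set during construction while ordinary assignment afterwards is still rejected:

```python
from dataclasses import FrozenInstanceError, dataclass


@dataclass(frozen=True, init=False)
class TrickyDemo:  # hypothetical name, mirrors Tricky above
    thing1: int
    thing2: str

    def __init__(self, thing3):
        # Bypass the frozen __setattr__ that dataclass installs.
        object.__setattr__(self, "thing3", thing3)


demo = TrickyDemo("bar")
print(demo.thing3)

try:
    demo.thing3 = "baz"  # ordinary assignment is still blocked
    blocked = False
except FrozenInstanceError:
    blocked = True
print(blocked)
```

One caveat worth checking in your own code: thing1 and thing2 are never assigned on the instance here, so the generated __repr__, __eq__ and __hash__, which read those fields, should raise AttributeError when called on such an object.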

Unfortunately, Python does not provide a way to call the default implementation from a custom __init__, so we can't simply do something like:

@dataclass(frozen=True, init=False)
class Tricky:
    thing1: int
    thing2: str

    def __init__(self, thing3, **kwargs):
        self.__default_init__(DoSomething(thing3), **kwargs)

However, we can implement that behavior quite easily with a custom decorator:

def dataclass_with_default_init(_cls=None, *args, **kwargs):
    def wrap(cls):
        # Save the current __init__ and remove it so dataclass will
        # create the default __init__.
        user_init = getattr(cls, "__init__")
        delattr(cls, "__init__")

        # let dataclass process our class.
        result = dataclass(cls, *args, **kwargs)

        # Restore the user's __init__ and save the default init as __default_init__.
        setattr(result, "__default_init__", result.__init__)
        setattr(result, "__init__", user_init)

        # In case dataclass returns a new class instead of modifying cls
        # in place (by default it does not), restore cls's __init__ as well.
        if result is not cls:
            setattr(cls, "__init__", user_init)

        return result

    # Support both dataclass_with_default_init() and dataclass_with_default_init
    if _cls is None:
        return wrap
    else:
        return wrap(_cls)

and then

@dataclass_with_default_init(frozen=True)
class DataClass:
    value: int

    def __init__(self, value: str):
        # error:
        # self.value = int(value)

        self.__default_init__(value=int(value))

Update: I opened this bug and I hope to implement that by 3.9.

Shmuel H.
  • 2
    The linked issue was unfortunately closed as "won't fix". – rudolfbyker Jun 28 '21 at 13:15
  • 3
    Hi, why do you indicate your solution as 'pretty hacky'? (`object.__setattr__(self, "thing3", thing3)`) It works fine and is compact. Is calling 'object' a bad practise? – pierre_j Dec 01 '21 at 13:21
12

I need the frozen=True (for hashability).

There is no strict need to freeze a class just to be hashable. You can opt to just not mutate the attributes from anywhere in your code, and set unsafe_hash=True instead.

However, you should really declare thing3 as a field, and not use a custom __init__:

from dataclasses import dataclass, field
from typing import Any

@dataclass(unsafe_hash=True)
class Tricky:
    thing1: int = field(init=False)
    thing2: str = field(init=False)
    thing3: Any

    def __post_init__(self):
        self.thing1 = 42
        self.thing2 = 'foo'

Here thing1 and thing2 have init=False set, so they are not passed to the __init__ method. You then set them in a __post_init__() method.

Note that this now requires that you don't freeze the class; otherwise you can't set thing1 and thing2 through normal assignment either, neither in a custom __init__ nor in __post_init__.

Demo:

>>> Tricky('bar')
Tricky(thing1=42, thing2='foo', thing3='bar')
>>> hash(Tricky('bar'))
-3702476386127038381

If all you want is a schema definition, you don’t need dataclasses at all. You can get the class annotations from any class; either as raw annotations or with typing.get_type_hints().
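
As a rough sketch of that idea, the example below reads a schema from plain classes without @dataclass. The names TableSchema and UserTable and their fields are hypothetical, chosen only for illustration:

```python
from typing import get_type_hints


class TableSchema:  # hypothetical base class, no @dataclass involved
    thing1: int
    thing2: str


class UserTable(TableSchema):  # hypothetical subclass adding one column
    thing3: float


# Raw annotations: only what this class body itself declares.
own = UserTable.__annotations__

# Resolved hints: merged along the MRO, base classes first.
hints = get_type_hints(UserTable)
print(own)
print(hints)
```

get_type_hints() also resolves annotations written as strings, which raw __annotations__ access does not.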

Martijn Pieters
  • The real purpose of `Tricky` is a data access object, where the members define the schema of a table in a database. I don't actually ever want to actually set `thing1` or `thing2`; they're here purely to define a schema. (As I say, clearly not the intent of the dataclass, but the syntax is nice.) Clients will define subclasses of `Tricky`, each of which might define different members, but all of which need a `thing3` as defined in the superclass. If possible I don't want my subclasses to have to use the `field()` notation as this clutters up my nice clean schema definitions. – Sasgorilla Sep 11 '19 at 17:37
  • 3
    Just don’t use dataclasses. The notation is not specific to the library. – Martijn Pieters Sep 11 '19 at 17:39
  • If I don't use a dataclass, do you know what would be the equivalent of `__dataclass_fields__`? I.e., how can I get only the fields defined with the `name: type` syntax? – Sasgorilla Sep 11 '19 at 17:42
  • 4
    Just access the annotations: `__annotations__`. Or the type hints with `typing.get_type_hints()` – Martijn Pieters Sep 11 '19 at 17:44
10

Here's a simpler option: just add a make() class method that acts as a factory:

from dataclasses import dataclass

@dataclass(frozen=True)
class Tricky:
    thing1: str
    thing2: int
    thing3: bool

    @classmethod
    def make(cls, whatever: str, you: bool, want: float):
        return cls(whatever + "..", you * 4, want > 5)

x = Tricky.make("foo", False, 3)

Depending on what your make method does, it may be a good idea to follow Rust's naming convention, from_foo(). For example:

@dataclass(frozen=True)
class Coord:
    lat: float
    lon: float

    @classmethod
    def from_os_grid_reference(cls, x: int, y: int):
        return cls(...)

    @classmethod
    def from_gps_nmea_string(cls, nmea_string: str):
        return cls(...)

Timmmm
  • This answer is buggy. Use `@classmethod` if you expect a `cls` parameter. – Hugues Mar 18 '22 at 18:29
  • I would go one step further and make `make` or `from_..` methods into standalone functions (like `make_tricky` etc.) - it's easier to add type annotation (you don't have to enclose it in quotes), it's one indentation less, it's more independent of actual class implementation and it may have shorter name – Jan Spurny May 11 '23 at 10:57
  • Type annotations are much less of an issue now (no quotes needed) with [Python 3.11 and the introduction of PEP 673](https://docs.python.org/3/whatsnew/3.11.html#whatsnew311-pep673) – Alex Povel Aug 31 '23 at 07:47
4

It turns out that dataclasses doesn't provide the functionality you are looking for. The attrs library, however, does, through converters:

from attr import attrs, attrib


@attrs(frozen=True)
class Name:
    name: str = attrib(converter=str.lower)

Same answer to similar question: See https://stackoverflow.com/a/64695607/3737651

naught101
3

What @Shmuel H. posted did not work for me; it still raised FrozenInstanceError.

This is what worked for me:

Here I accept a value in the init and check whether it is compatible with the format passed to strptime. If it is, I assign the parsed date; if not, I print the exception:

from dataclasses import InitVar, dataclass, field
from datetime import datetime

@dataclass(frozen=True)
class Birthday:
    value: InitVar[str]
    date: datetime = field(init=False)

    def __post_init__(self, value: str):
        try:
            self.__dict__['date'] = datetime.strptime(value, '%d/%m/%Y')
        except Exception as e:
            print(e)
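
Printing the exception leaves the instance without a date attribute when parsing fails. A possible variant, sketched here under the assumption that a failed parse should abort construction, lets the ValueError propagate and uses object.__setattr__ instead of writing to __dict__. The name BirthdayStrict and the sample inputs are hypothetical:

```python
from dataclasses import InitVar, dataclass, field
from datetime import datetime


@dataclass(frozen=True)
class BirthdayStrict:  # hypothetical variant of Birthday above
    value: InitVar[str]
    date: datetime = field(init=False)

    def __post_init__(self, value: str):
        # strptime raises ValueError on a mismatch; let it propagate so
        # that no instance without a date attribute is ever created.
        parsed = datetime.strptime(value, "%d/%m/%Y")
        object.__setattr__(self, "date", parsed)


b = BirthdayStrict("11/09/2019")
print(b.date.year)

try:
    BirthdayStrict("2019-09-11")  # wrong format for %d/%m/%Y
    rejected = False
except ValueError:
    rejected = True
print(rejected)
```
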
Max