Background

I'm using dataclasses to create a nested data structure that represents a complex test output.

Previously I'd been creating a hierarchy by creating multiple top-level dataclasses and then using composition:

from dataclasses import dataclass

@dataclass
class Meta:
    color: str 
    size: float

@dataclass
class Point:
    x: float
    y: float
    stuff: Meta

point1 = Point(x=5, y=-5, stuff=Meta(color='blue', size=20))

Problem

I was wondering if there was a way of defining the classes in a self-contained way, rather than polluting my top level with a bunch of lower-level classes. In the example above, the definition of the Point dataclass would then contain the definition of Meta, rather than Meta being defined at the top level.

Solution?

I wondered if it's possible to use inner (dataclass) classes with a dataclass and have things all work.

So I tried this:

from dataclasses import dataclass
from typing import get_type_hints


@dataclass
class Point:

    @dataclass
    class Meta:
        color: str 
        size: float

    @dataclass
    class Misc:
        elemA: bool
        elemB: int 

    x: float
    y: float
    meta: Meta
    misc: Misc


point1 = Point(x=1, y=2,
               meta=Point.Meta(color='red', size=5.5),
               misc=Point.Misc(elemA=True, elemB=-100))

print("This is the point:", point1)
print(point1.x)
print(point1.y)
print(point1.meta)
print(point1.misc)
print(point1.meta.color)
print(point1.misc.elemB)

point1.misc.elemB = 99
print(point1)
print(point1.misc.elemB)

This all seems to work - the print outputs all work correctly, and the assignment to a (sub) member element works as well.
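As a further sanity check (a minimal sketch, not part of the original test; the variable names `p` and `q` are only illustrative), the nested classes can be run through the standard `dataclasses.asdict` helper and the generated `__eq__`, to confirm they behave like ordinary composed dataclasses:

```python
from dataclasses import dataclass, asdict


@dataclass
class Point:

    @dataclass
    class Meta:
        color: str
        size: float

    x: float
    y: float
    meta: Meta


p = Point(x=1, y=2, meta=Point.Meta(color='red', size=5.5))
q = Point(x=1, y=2, meta=Point.Meta(color='red', size=5.5))

# asdict() recurses into the nested dataclass instance
print(asdict(p))  # {'x': 1, 'y': 2, 'meta': {'color': 'red', 'size': 5.5}}

# the generated __eq__ compares nested values field by field
print(p == q)     # True
```

Only the annotated names (x, y, meta) become fields of Point; the inner class itself is just a class attribute, reachable as Point.Meta.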

You can even support defaults for nested elements:

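from dataclasses import dataclass, field


@dataclass
class Point:

    @dataclass
    class Meta:
        color: str = 'red'
        size: float = 10.0

    x: float
    y: float
    # default_factory builds a fresh Meta for each Point; a bare Meta()
    # default raises ValueError on Python 3.11+ (unhashable default)
    meta: Meta = field(default_factory=Meta)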


pt2 = Point(x=10, y=20)
print('pt2', pt2)

...prints out red and 10.0 defaults for pt2 correctly
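One caveat about defaults for nested fields, shown as a hedged sketch (the names `a` and `b` are only illustrative). An instance written directly as a class-level default is shared by every object that does not override it, and Python 3.11+ rejects unhashable defaults, which includes instances of regular (non-frozen) dataclasses, with a ValueError. Using `field(default_factory=...)` avoids both issues by building a new Meta per Point:

```python
from dataclasses import dataclass, field


@dataclass
class Point:

    @dataclass
    class Meta:
        color: str = 'red'
        size: float = 10.0

    x: float
    y: float
    # each Point() call invokes Meta() to build its own default
    meta: Meta = field(default_factory=Meta)


a = Point(x=1, y=2)
b = Point(x=3, y=4)

# mutating one point's nested default must not leak into the other
a.meta.color = 'blue'
print(a.meta.color)       # blue
print(b.meta.color)       # red
print(a.meta is b.meta)   # False
```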

Question

Is this a correct way to implement nested dataclasses?

(That is: is it just luck that it works now, and likely to break in future? ...or is it just fugly and Not How You Do Things? ...or is it just Bad?)

...It's certainly a lot cleaner and a million times easier to understand and support than a gazillion top-level 'mini' dataclasses being composed together.

...It's also a lot easier than trying to use marshmallow or jury-rigging a JSON schema to class structure model.

...It also is very simple (which I like)

Richard

1 Answer

You can just use strings to annotate classes that don't exist yet:

from dataclasses import dataclass

@dataclass
class Point:
    x: float
    y: float
    stuff: "Meta"

@dataclass
class Meta:
    color: str 
    size: float


point1 = Point(x=5, y=-5, stuff=Meta(color='blue', size=20))

That way, you can order the class definitions in whatever way makes the most sense. Static type checkers like mypy also understand forward references written this way; they are part of the original PEP on type hints (PEP 484), so this is nothing exotic. Nesting the classes also solves the problem but is, in my opinion, harder to read, since flat is better than nested.
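To illustrate what happens to the string annotation at runtime, here is a small sketch (the names `stuff_field` and `hints` are only illustrative). The dataclass machinery stores the annotation as a plain string, and `typing.get_type_hints` can resolve it to the real class once that class exists. The `localns` argument is passed explicitly here only so the sketch does not depend on how the snippet is executed; in an ordinary module, calling `get_type_hints(Point)` alone would normally find Meta in the module globals.

```python
from dataclasses import dataclass, fields
from typing import get_type_hints


@dataclass
class Point:
    x: float
    y: float
    stuff: "Meta"  # forward reference: Meta is defined further down


@dataclass
class Meta:
    color: str
    size: float


# dataclasses keep the annotation as written, without evaluating it
stuff_field = next(f for f in fields(Point) if f.name == "stuff")
print(repr(stuff_field.type))   # 'Meta'

# resolve the string to the actual class on demand
hints = get_type_hints(Point, localns={"Meta": Meta})
print(hints["stuff"] is Meta)   # True

point1 = Point(x=5, y=-5, stuff=Meta(color='blue', size=20))
print(point1.stuff.color)       # blue
```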

Arne
  • Thanks for the info Arne. I've got to admit I'd forgotten about forward references. I guess I'm still at a philosophical point, where you have a single element that has two attributes. In TypeScript, for example, this is no problem: you just make it a structure within your class: stuff: {color: str; size: float}. In Python you can't, so you've got to use composition. But if I have a dataclass with 5 'levels' and each has maybe 10 elements, I don't know if it's better to have to define 50 top-level dataclasses, just so I can at some point compose them all into a single complex one. – Richard Sep 24 '20 at 23:05
  • Coming from JS/TS to Python (newbie), even I was stumped by the complex JSON-to-dataclass conversions. In my case, I use the nested dataclass syntax as well. Unfortunately, I have a ton of keys, so I cannot specify each key; I have to use hacks like assigning the nested part to a temp object, deleting it from the main object, then expanding with (**json_obj), etc. I wish there was a more elegant solution. – Pritesh Tupe Jul 22 '22 at 11:56