1

If I want an instance attribute to be:

  • Non-public (i.e., named with a single leading underscore)
  • A parameter in the __init__ signature

Normally, I would do this:

class Foo:
    def __init__(self, bar: str):
        self._bar = bar

foo = Foo(bar="bar")  # foo.bar would raise an AttributeError

However, in dataclasses, I'm unsure how to do this.

from dataclasses import dataclass

@dataclass
class Foo:
    bar: str  # This leaves bar as a public instance attribute

What is the correct way to do this in dataclasses.dataclass?

martineau
Intrastellar Explorer
  • If you've got private attributes, you probably shouldn't be using `dataclasses`. A dataclass is intended to just be a simple data holder (hence the "data" name), not something with opaque private state. – user2357112 Nov 13 '20 at 02:49
  • Yeah, going to second that actually. I did find an explanation here of how to do it (see solution #5), but it doesn't seem like a good idea. https://florimond.dev/blog/articles/2018/10/reconciling-dataclasses-and-properties-in-python/ – Alex Watt Nov 13 '20 at 02:57
  • Yeah I just read that article. Seems to be an answer, but I agree, it's unwieldy. Guess I should stick to a regular class for my use case, thank you both! – Intrastellar Explorer Nov 13 '20 at 03:41

3 Answers

6

This is a relatively simple case for an InitVar and the __post_init__ method. (Though the verbosity is probably not what you had in mind.)

from dataclasses import dataclass, InitVar


@dataclass
class Foo:
    bar: InitVar[str]

    def __post_init__(self, bar):
        self._bar = bar

This behaves as you described: bar (as written) is a required argument to __init__, but no attribute named bar is created on the instance.

>>> f = Foo()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: __init__() missing 1 required positional argument: 'bar'
>>> f = Foo(9)
>>> f._bar
9
>>> f.bar
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'Foo' object has no attribute 'bar'

To be clear, this is something you would use in an existing dataclass; you wouldn't choose to create a dataclass just to do this.
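
One consequence worth seeing in code: since _bar is assigned in __post_init__ rather than declared as a field, the generated methods should not know about it. The sketch below is only an illustration; the extra size field is a hypothetical addition so that there is something to compare against.

```python
from dataclasses import InitVar, asdict, dataclass, fields


@dataclass
class Foo:
    bar: InitVar[str]  # init-only pseudo-field, never stored as foo.bar
    size: int = 0      # hypothetical ordinary field, added for contrast

    def __post_init__(self, bar: str) -> None:
        # Plain instance attribute; the dataclass machinery does not track it.
        self._bar = bar


foo = Foo("abc", size=3)
field_names = [f.name for f in fields(foo)]
as_dict = asdict(foo)
same = foo == Foo("xyz", size=3)

print(field_names)  # ['size']
print(as_dict)      # {'size': 3}
print(same)         # True: _bar is ignored by the generated __eq__
```

If _bar needs to take part in equality or in asdict(), it would have to be declared as a real field instead.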

chepner
1

If you want it to be an __init__() argument, just write your own __init__(), which prevents one from being generated automatically. Note that specifying init=False as shown below isn't really required, since no __init__() would have been generated anyway, but it nevertheless seems like a good way to draw attention to what's going on. For the same reason, specifying field(init=False) for the private _bar field is superfluous.

from dataclasses import dataclass, field


@dataclass(init=False)
class Foo:
    def __init__(self, bar: str):
        self._bar = bar

    _bar: str = field(init=False)


foo = Foo(bar="xyz")
print(foo._bar)  # -> xyz
print(foo.bar)  # -> AttributeError: 'Foo' object has no attribute 'bar'
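
A side effect of this variant, shown here as a rough sketch of the same class: because _bar is declared as a real field, it should still appear in fields() and asdict() and take part in the generated __eq__.

```python
from dataclasses import asdict, dataclass, field, fields


@dataclass(init=False)
class Foo:
    _bar: str = field(init=False)

    def __init__(self, bar: str):
        self._bar = bar


foo = Foo(bar="xyz")
field_names = [f.name for f in fields(foo)]
as_dict = asdict(foo)
differs = foo != Foo(bar="abc")

print(field_names)  # ['_bar']
print(as_dict)      # {'_bar': 'xyz'}
print(differs)      # True: _bar is compared by the generated __eq__
```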
martineau
1

There is a long discussion on using @property in dataclasses that can resolve your problem (the _bar attribute stays protected behind the getter/setter).

With that approach, unwanted attributes may show up in the output of __repr__ or asdict(), but this can be worked around.
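
For illustration only, here is a minimal sketch of one commonly seen property-based pattern. It is not necessarily the exact approach from the discussion referred to above. It assumes bar is always passed to the constructor, and note that bar stays readable through the property, which differs from the question's plain-class example.

```python
from dataclasses import asdict, dataclass, field


@dataclass
class Foo:
    bar: str                                    # becomes the __init__ parameter
    _bar: str = field(init=False, repr=False)   # backing storage, hidden from repr

    @property
    def bar(self) -> str:
        return self._bar

    @bar.setter
    def bar(self, value: str) -> None:
        # The generated __init__ assigns self.bar, which lands here.
        self._bar = value


foo = Foo(bar="xyz")
value = foo.bar
as_dict = asdict(foo)

print(value)    # xyz
print(as_dict)  # {'bar': 'xyz', '_bar': 'xyz'}
```

The asdict() output shows the duplicated entry mentioned above. A caveat of this sketch: calling Foo() with no argument would store the property object itself as the value, because the property ends up being treated as the field's default, and static type checkers may flag the redefinition of bar.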


PS I can't add a comment, so I've added a new answer.

Roman Zh.