63

I have some existing Python 3.6 code that I'd like to move to Python 3.7 dataclasses. I have __init__ methods with nice docstring documentation, specifying the attributes the constructors take and their types.

However, if I change these classes to use the new Python dataclasses in 3.7, the constructor is implicit. How do I provide constructor documentation in this case? I like the idea of dataclasses, but not if I have to forgo clear documentation to use them.

edited to clarify I'm using docstrings presently

bad_coder
anahata
  • It seems dataclasses automatically generate a docstring for the class that includes the type hints from the class definition, for example `'C(name: str, number: int)'`, but the docstring for the automatically generated `__init__` method is `None`. So I suppose you could manually assign the `__init__` docstring after the class definition. A bit clunky though. – snakecharmerb Jul 01 '18 at 17:53
  • This autogenerated docstring doesn't show up if I already have a docstring on the class, which is fine, because a human-supplied docstring is (usually!) much better than an autogenerated one. Manually assigning the docstring is definitely clunky and something I'd like to avoid if possible, hence this question. – anahata Jul 01 '18 at 17:58
  • True on both points. Also, a manually assigned docstring will work for runtime tools like `help`, but perhaps not for documentation generators like Sphinx. – snakecharmerb Jul 01 '18 at 18:05
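
A minimal sketch of what the comments above describe, assuming a class `C` with no docstring of its own. The descriptions in the assigned docstring are hypothetical, added only for illustration:

```python
from dataclasses import dataclass


@dataclass
class C:
    name: str
    number: int


# With no class docstring, dataclass generates one from the field annotations.
print(C.__doc__)

# The generated __init__ has no docstring of its own.
print(C.__init__.__doc__)

# Manual assignment after the class definition, as suggested in the comments.
# The parameter descriptions below are hypothetical.
C.__init__.__doc__ = """Create a C.

Args:
    name: Display name.
    number: Numeric identifier.
"""
```

This shows up in runtime tools such as `help(C.__init__)`, but, as noted above, documentation generators that read the source may not pick it up.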

3 Answers

56

The napoleon-style docstrings, as described in the Sphinx docs (see the ExampleError class for their take on it), explicitly cover your case:

The __init__ method may be documented in either the class level docstring, or as a docstring on the __init__ method itself.

If you do not want this behavior, you have to explicitly tell Sphinx that the constructor docstring and the class docstring are not the same thing.

Meaning, you can just paste your constructor info into the body of the class docstring.
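
As a rough sketch, these are the `conf.py` options that I believe control how Sphinx combines class and `__init__` docstrings. The values shown are only one possible choice, not a recommended configuration:

```python
# conf.py (fragment): a sketch, not a complete configuration
extensions = [
    "sphinx.ext.autodoc",
    "sphinx.ext.napoleon",
]

# autodoc: which docstring is used as the class description.
# "class" uses only the class docstring, "init" only the __init__ docstring,
# and "both" concatenates the two.
autoclass_content = "class"

# napoleon: True lists __init__ docstrings separately from the class
# docstring; False treats them as part of the class documentation.
napoleon_include_init_with_doc = False
```

Since a dataclass has no hand-written `__init__` docstring, leaving everything in the class docstring works with either setting.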


If you build documentation from your docstrings, these are the levels of detail you can achieve:

1) The bare minimum:

from dataclasses import dataclass


@dataclass
class TestClass:
    """This is a test class for dataclasses.

    This is the body of the docstring description.
    """
    var_int: int
    var_str: str


2) Additional constructor parameter description:

@dataclass
class TestClass:
    """This is a test class for dataclasses.

    This is the body of the docstring description.

    Args:
        var_int (int): An integer.
        var_str (str): A string.

    """
    var_int: int
    var_str: str


3) Additional attribute description:

@dataclass
class TestClass:
    """This is a test class for dataclasses.

    This is the body of the docstring description.

    Attributes:
        var_int (int): An integer.
        var_str (str): A string.

    """
    var_int: int
    var_str: str



Parameter and attribute descriptions can of course be combined as well, but since a dataclass's attributes should map straightforwardly to the constructor's arguments, there is usually little reason to do so.

In my opinion, 1) is enough for small or simple dataclasses: it already includes the constructor signature with the parameter types, which is plenty for a dataclass. If you want to say more about each attribute, 3) serves best.
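
To illustrate why option 1) already carries the constructor information, here is a small sketch that inspects the generated signature at runtime with the standard `inspect` module. The `param_types` name is just a helper for this example:

```python
import inspect
from dataclasses import dataclass


@dataclass
class TestClass:
    """This is a test class for dataclasses."""
    var_int: int
    var_str: str


# The generated constructor carries the field names and their annotations.
signature = inspect.signature(TestClass)
print(signature)

# The annotation of each constructor parameter is available individually.
param_types = {name: p.annotation for name, p in signature.parameters.items()}
```

Documentation tools can read the same signature, which is why the bare class docstring is often sufficient.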

Arne
  • This is delightfully in-depth, thank you! I especially appreciate the reference to the Sphinx docs that cover precisely this case. – anahata Jul 02 '18 at 12:57
  • @Arne bro, what configuration for Sphinx do you use? I can't achieve attributes documentation generation, like in your third example. – woozly Apr 10 '20 at 13:39
  • @woozly I just re-ran what I think was my sample code, and I can't get them to look the same. Maybe Sphinx changed how it treats dataclasses? Anyway, my `conf.py`s all look more or less [like this](https://github.com/a-recknagel/stenotype/blob/55aa7e448b59a923709fd9632d5d4c0ed7eb128d/docs/conf.py) (just switch the theme to `"alabaster"` for the look in the post here), and the docs are built [as described here](https://github.com/a-recknagel/stenotype/blob/55aa7e448b59a923709fd9632d5d4c0ed7eb128d/TOOLING.rst#documentation). – Arne Apr 10 '20 at 15:01
  • @Arne thank you! I got it working with the `sphinx.ext.napoleon` extension enabled. – woozly Apr 11 '20 at 16:04
  • @Arne I have a similar configuration and my problem is that I'm getting the fields twice in the docs, once with my docstring, the other one auto-generated with type hints – magomar Oct 28 '20 at 10:52
  • @magomar You're right, I just re-ran the code and am also getting duplicate variable descriptions that didn't appear before. This answer is now out of date as Sphinx started giving dataclasses special treatment. – Arne Oct 29 '20 at 09:39
  • Correction: all classes now get their classvars listed in the docstring rendered by Sphinx, not only dataclasses. I'm following this up in [this new question](https://stackoverflow.com/questions/64588821/how-to-supress-classvar-listing-in-class-docstring) and will update this answer accordingly. – Arne Oct 29 '20 at 10:16
  • @bad_coder Thanks, I just saw it. I need to study it more carefully, but since I generate the rst files automatically, I don't think it would be practical for me to have to add exceptions for each dataclass – magomar Oct 30 '20 at 17:37
  • It's probably an issue with PyCharm, but this solution leads to "unresolved reference" errors in the IDE, while the solution by @Edward does not – Dan Ciborowski - MSFT Mar 15 '22 at 22:30
8

I think the easiest way is:

from dataclasses import dataclass


@dataclass
class TestClass:
    """This is a test class for dataclasses.

    This is the body of the docstring description.

    """
    var_int: int  #: An integer.

    #: A string.
    #: (Able to have multiple lines.)
    var_str: str

    var_float: float
    """A float. (Able to have multiple lines.)"""

Not sure why the results rendered by @Arne look like that. In my case, the attributes in a dataclass always show regardless of the docstring, in all three cases: 1) the bare minimum, 2) the additional constructor parameter description, and 3) the additional attribute description.

Probably because I have set something wrong in my conf.py (Sphinx v3.4.3, Python 3.7):

extensions = [
    "sphinx.ext.napoleon",
    "sphinx.ext.autodoc",
    "sphinx_autodoc_typehints",
    "sphinx.ext.viewcode",
    "sphinx.ext.autosectionlabel",
]

# Napoleon settings
napoleon_google_docstring = True
napoleon_include_init_with_doc = True
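
One caveat worth sketching: the `#:` comments and attribute docstrings used in this answer are read from the source by Sphinx and are not stored on the class at runtime. The example below is a shortened variant of the class above, and `field_types` is a helper name introduced only for this illustration:

```python
from dataclasses import dataclass, fields


@dataclass
class TestClass:
    """This is a test class for dataclasses."""
    var_int: int  #: An integer.

    var_float: float
    """A float."""


# The class docstring is the only documentation text kept at runtime.
print(TestClass.__doc__)

# Field names and types are available, but not the attribute comments.
field_types = {f.name: f.type for f in fields(TestClass)}
```

So this style suits generated documentation, while runtime tools such as `help` only see the class docstring and the annotations.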
Edward
  • "Not sure why rendered results by @Arne look like that" -> Something changed in how Sphinx renders attributes by default: https://stackoverflow.com/q/64588821/962190 – Arne May 19 '21 at 09:32
  • This solution works with IDEs like PyCharm, while other solutions raise errors with unresolved references. – Dan Ciborowski - MSFT Mar 15 '22 at 22:30
7

A major advantage of dataclasses is that they are self-documenting. Assuming the reader of your code knows how dataclasses work (and your attributes are appropriately named), the type-annotated class attributes should be excellent documentation of the constructor. See this example from the official dataclass docs:

from dataclasses import dataclass


@dataclass
class InventoryItem:
    '''Class for keeping track of an item in inventory.'''
    name: str
    unit_price: float
    quantity_on_hand: int = 0

    def total_cost(self) -> float:
        return self.unit_price * self.quantity_on_hand

If you don't expect readers of your code to know how dataclasses work, you might want to reconsider using them, or add an explanation or a link to the docs in an inline comment after the @dataclass decorator. If you really need a docstring for a dataclass, I'd recommend putting the constructor documentation within the class docstring. For the example above:

'''Class for keeping track of an item in inventory.

Constructor arguments:
:param name: name of the item
:param unit_price: price in USD per unit of the item
:param quantity_on_hand: number of units currently available
'''
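
Put together, the class with the extended docstring might look like the sketch below. The instance values (`"widget"`, `2.5`, `4`) are made up for illustration:

```python
from dataclasses import dataclass


@dataclass
class InventoryItem:
    '''Class for keeping track of an item in inventory.

    Constructor arguments:
    :param name: name of the item
    :param unit_price: price in USD per unit of the item
    :param quantity_on_hand: number of units currently available
    '''
    name: str
    unit_price: float
    quantity_on_hand: int = 0

    def total_cost(self) -> float:
        return self.unit_price * self.quantity_on_hand


# Hypothetical usage: the constructor arguments follow the field order.
item = InventoryItem("widget", 2.5, 4)
print(item.total_cost())
```

Because the docstring belongs to the class, the parameter descriptions appear in `InventoryItem.__doc__` and in `help(InventoryItem)`.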
orn688
  • Hmmm. This places a greater emphasis on the importance of naming your attributes if they're truly going to be self-documenting. Thank you for the suggestions! – anahata Jul 01 '18 at 19:08
  • @orn688 But is it possible to eliminate `dataclass` completely from the documentation? – Dmytro Chasovskyi Jan 06 '20 at 14:26
  • It's worth noting that `help(InventoryItem)` will _not_ show the attributes `name`, `unit_price`, or `quantity_on_hand` or their docstrings. – Mike Holler Feb 12 '20 at 21:51