
The documentation for the Field class of Python's standard dataclasses module specifies only:

Its documented attributes are:

  • [...]
  • type: The type of the field.

To me, this seems to mean that the field will contain the type itself, and not only its name in the form of a string.

However, it seems that it simply copies the type annotation as is, making it quite useless.

Example:

import dataclasses
import typing

@dataclasses.dataclass
class C:
    c: 'C'

dataclasses.fields(C)[0].type  # This returns the string 'C'
typing.get_type_hints(C)['c']  # This returns the class C, as expected

The problem even occurs systematically, for every field, when using PEP 563 type annotations.

Is this a bug in the dataclasses module? Is this the expected behavior? If so, how do I retrieve a type object given a Field instance?

Arne
lovasoa
  • What exactly is the problem? What do you *expect* the `dataclass` module to do with the type? It's handling the type hint exactly the way every other type hint is handled. – chepner May 01 '19 at 14:34
  • Well, I would expect it to do what it specifies in its documentation, that is, returning the type itself, and not the type annotation. – lovasoa May 01 '19 at 14:36

1 Answer


This is deliberate. Resolving type hints at import time is expensive, especially when from __future__ import annotations has been used to disable resolving them in the first place.

Initially, the addition of PEP 563 to Python 3.7 broke dataclasses when you used the from __future__ import annotations switch and included ClassVar or InitVar type annotations for fields; these would not be resolved at this point and remained strings. This was already a problem before PEP 563 if you explicitly used strings, see dataclasses issue #92. This became a Python bug, #33453, once dataclasses made it into Python 3.7 proper.

The 'parent' project, attrs, which inspired dataclasses, also had this issue to solve. There, Łukasz Langa (co-author of most of the type hinting PEPs, including PEP 563), stated:

OK, so I tried the above and it seems it's a nuclear option since it forces all annotations to be evaluated. This is what I wanted to avoid with from __future__ import annotations.

and in the discussion on the pull request that fixed issue 33453, Eric Smith, author of dataclasses, stated:

I've been researching doing just that. I think @ambv's point is that it introduces a performance hit due to calling eval on every field, while the point of string annotations is to remove a performance hit.

Moreover, there were other problems; you can't evaluate all type hints at import time, not when they use forward references:

In addition to the performance issue, in the following case (without a __future__ statement and without dataclasses), I get an error on get_type_hints() because C is undefined when get_type_hints() is called. This is python/typing#508. Notice that where get_type_hints() is called in this example is exactly where @dataclass is going to run and would need to call the stripped down get_type_hints().
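The forward-reference problem described in that quote can be reproduced without dataclasses. The following is a minimal sketch, assuming the code runs as an ordinary script or module; `resolve_eagerly` is a hypothetical decorator invented for illustration, standing in for the point at which `@dataclass` runs.

```python
import typing


def resolve_eagerly(cls):
    # Hypothetical decorator: tries to resolve hints at decoration time,
    # which is the same moment at which @dataclass would run.
    try:
        typing.get_type_hints(cls)
        cls.eager_error = None
    except NameError as exc:
        # The name C is not yet bound in the module namespace here.
        cls.eager_error = exc
    return cls


@resolve_eagerly
class C:
    c: "C"


# Once the class statement has finished, the name C is bound and resolves.
late_hints = typing.get_type_hints(C)
```

The class object already exists when the decorator is called, but the name is only assigned after the decorator returns, so the string annotation cannot be evaluated any earlier.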

So in the end, all that dataclasses does is apply string heuristics to the annotations; it will not resolve them for you.
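The effect of those heuristics can be observed from the outside. This is a small sketch, assuming a CPython version with the fix for issue 33453 and that `typing` is imported under that name in the defining module; the `Config` class and its fields are made up for illustration.

```python
import dataclasses
import typing


@dataclasses.dataclass
class Config:
    name: "str"
    # Written as a string; dataclasses inspects the text rather than
    # evaluating it, and treats this as a class variable, not a field.
    instances: "typing.ClassVar[int]" = 0


field_names = [f.name for f in dataclasses.fields(Config)]
first_type = dataclasses.fields(Config)[0].type  # still the string 'str'
```

Only `name` ends up in `dataclasses.fields(Config)`, and its `.type` is left as the unevaluated string.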

To retrieve the type, just use get_type_hints() on the class itself, and use the field's .name attribute as the key into the result:

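import dataclasses
import typing

resolved = typing.get_type_hints(C)  # resolves all annotations of C at once
f = dataclasses.fields(C)[0]
ftype = resolved[f.name]             # the class C, not the string 'C'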
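If you need this for every field, the same lookup can be wrapped in a small helper. This is only a sketch: `resolved_field_types` and the `Node` class are hypothetical names, not part of dataclasses, and the code assumes the class is defined at module level so that `get_type_hints()` can find it.

```python
import dataclasses
import typing


@dataclasses.dataclass
class Node:
    # Forward reference written as a string, as in the question.
    next: typing.Optional["Node"] = None
    value: int = 0


def resolved_field_types(cls):
    """Map each dataclass field name to its resolved type hint."""
    hints = typing.get_type_hints(cls)
    return {f.name: hints[f.name] for f in dataclasses.fields(cls)}


types = resolved_field_types(Node)
```

Calling the helper after the class is fully defined avoids the forward-reference problem, and the cost of resolution is only paid when you actually ask for the types.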
Martijn Pieters
  • Thank you for this detailed answer! Did no one consider resolving the annotations when `dataclasses.fields` is called, as it is the only way for a user to obtain a Field instance? It would incur no performance penalty at import time, and would make the .type field consistent. Without it, I don't see any case where this field could do anything useful... – lovasoa May 01 '19 at 15:38
  • @lovasoa: (sorry, this dropped off my radar for longer than I'd have liked): The implementation needs to have `dataclasses.Field()` instances to generate the class object, which includes an `__init__` method with those same type annotations, and potential class variables and init-only fields. So when the `@dataclass` decorator is executed, the `Field()` instances are created too. – Martijn Pieters May 02 '19 at 11:58