9

I'd like to create a copy of an existing instance of a dataclass and modify it.

Suppose we have a dataclass and an instance of that dataclass:

from dataclasses import dataclass, field, InitVar, replace

@dataclass
class D:
    a: float = 10.                # Normal attribute with a default value
    b: InitVar[float] = 20.       # init-only attribute with a default value 
    c: float = field(init=False)  # an attribute that will be defined in __post_init__
    
    def __post_init__(self, b):
        self.c = self.a + b

d1 = D()

Now let's try to make a copy of d1 (I've tried the solutions proposed in this post):

  1. Using the replace() function:
d2 = replace(d1, **{})

throws an error

InitVar 'b' must be specified with replace()

It seems to be a reported bug, but I am not sure if there is any progress on it.

  2. By creating a new object from the __dict__ of the old object:
d2 = D(**d1.__dict__)

throws an error

__init__() got an unexpected keyword argument 'c'

Do you have any suggestions on how to copy a dataclass instance properly, or how to work around the issues above?


Edit:

  • Fixed the bug in the initial code (self.b in __post_init__)

I've written a workaround that seems to work (posted as an answer). If anyone can find drawbacks, I would very much appreciate it.

Roman Zh.

2 Answers

12

Just use the standard copy module:

import copy

d2 = copy.copy(d1)

If you want a deep copy, you can use copy.deepcopy.
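
A minimal sketch of the difference between the two, using a hypothetical Bag dataclass with a mutable field (it is not part of the question's code):

```python
import copy
from dataclasses import dataclass, field

@dataclass
class Bag:  # hypothetical class, used only to show shallow vs. deep copies
    items: list = field(default_factory=list)

b1 = Bag(items=[1, 2])
shallow = copy.copy(b1)      # new Bag, but the same list object
deep = copy.deepcopy(b1)     # new Bag with its own copy of the list

b1.items.append(3)

print(shallow.items)  # reflects the append, since the list is shared with b1
print(deep.items)     # unaffected by the append
```

For a dataclass with only immutable attributes, such as the floats in D, the two behave the same.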


Any approach based on calling the dataclass's constructor is doomed to failure. That includes replace, which delegates to the constructor. The problem is that InitVar values aren't stored anywhere, so there's no way to tell what values to pass for InitVars. (In particular, self.b is not the value that was passed for b - it's the class-level default value - so your __post_init__ is broken.)
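
To illustrate, here is a small sketch that assumes the D class from the question with __post_init__ using its b parameter. The value passed for b leaves no trace on the instance, while copy.copy still carries over the computed c and lets you modify the copy afterwards:

```python
import copy
from dataclasses import dataclass, field, fields, InitVar

@dataclass
class D:
    a: float = 10.
    b: InitVar[float] = 20.
    c: float = field(init=False)

    def __post_init__(self, b):
        self.c = self.a + b

d1 = D(b=5.)
print(d1)                             # c was computed from the provided b
print([f.name for f in fields(d1)])   # 'b' is not listed among the real fields
print(vars(d1))                       # and it is not stored on the instance
print(d1.b)                           # attribute lookup falls back to the class default

# copy.copy never calls __init__, so no InitVar value is needed.
d2 = copy.copy(d1)
d2.a = 50.                            # modifying the copy leaves d1 untouched
print(d1, d2)
```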

user2357112
  • 1
    No, this could create a view instead of a copy. `copy.deepcopy(d1)` could work, however, if the complex logic of a dataclass requires its re-creation it could go wrong too. – Roman Zh. Nov 17 '20 at 08:16
  • 3
    @RomanZhuravlev: What? Dataclasses have no concept of a view. You can't create a view of a dataclass instance, because there's no such thing. It creates a *shallow* copy, but so did the original version of your answer, and the code in your question would have created a shallow copy of objects it actually worked for. If you want a deep copy, you can use `copy.deepcopy`, but you never indicated you wanted one. – user2357112 Nov 17 '20 at 09:44
-1

I've written this workaround, which seems to work. If anyone can find drawbacks, I would very much appreciate it.

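import copy
from dataclasses import asdict

def copy_dataclass(D_class, d_obj, **kw):
    # Extra keyword arguments, e.g. values for InitVars, which asdict() does not return
    init_kwargs = {**kw}
    for key, value in asdict(d_obj).items():
        # Only fields accepted by __init__ can be passed to the constructor
        if d_obj.__dataclass_fields__[key].init:
            init_kwargs[key] = copy.deepcopy(value)

    return D_class(**init_kwargs)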

Which gives:

d2 = copy_dataclass(D, d1)

d1 == d2
# prints True

d1 is d2
# prints False

If we change a field in d2 it does not affect d1

d2.a = 50

print(f'd1 = {d1};\nd2 = {d2}')
# d1 = D(a=10.0, c=30.0);
# d2 = D(a=50, c=30.0)

Edit

  • Added copy.deepcopy(value) for cases where dataclass attributes are mutable;
  • __dict__ replaced by the asdict() function;
  • additional keyword arguments can be passed to the function in order to override the default values of InitVar attributes;
  • an additional check can be added to verify that all InitVars were supplied (using D.__dataclass_fields__). With these modifications we essentially get the replace() function...
Roman Zh.
  • Doesn't work. It only looks like it does because bugs in your `D` class and your `copy_dataclass` hide each other. You're misusing InitVars - your `__post_init__` completely ignores the provided value of `b`. If we [fix that bug](https://ideone.com/mA88O7), we see that your `copy_dataclass` fails to perform correct copies. Any approach that relies on calling the `D` constructor is doomed to failure, because InitVar values aren't actually stored anywhere, so we can't recover the values of InitVars. – user2357112 Nov 17 '20 at 09:52
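
One way to see the problem described in the comment above, as a sketch rather than the code behind the linked example: assume D's __post_init__ uses its b parameter, create the instance with a non-default b, and compare the rebuilt object with the result of copy.copy. The variable names rebuilt and copied are illustrative.

```python
import copy
from dataclasses import asdict, dataclass, field, InitVar

@dataclass
class D:
    a: float = 10.
    b: InitVar[float] = 20.
    c: float = field(init=False)

    def __post_init__(self, b):
        self.c = self.a + b

def copy_dataclass(D_class, d_obj, **kw):
    # Same logic as the workaround: rebuild through the constructor.
    init_kwargs = {**kw}
    for key, value in asdict(d_obj).items():
        if d_obj.__dataclass_fields__[key].init:
            init_kwargs[key] = copy.deepcopy(value)
    return D_class(**init_kwargs)

d1 = D(b=5.)                     # non-default InitVar value
rebuilt = copy_dataclass(D, d1)  # constructor runs again with the default b
copied = copy.copy(d1)           # no constructor call, attributes copied as-is

print(d1)
print(rebuilt)
print(copied)
print(rebuilt == d1, copied == d1)
```

The rebuilt object gets c recomputed from the default b, because the original value of b cannot be recovered from d1.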