4

How would one copy.deepcopy() all but one part of an object in Python?

I have an object which is basically a wrapper, with some settings and some extra bits of metadata, around a potentially huge pandas DataFrame. I want to make a copy of the object that consists of a shallow copy of the DataFrame and a deepcopy() of the settings and metadata (both of which can be mutable objects).

I don't know at run time whether all of the settings and metadata exist when the copy is needed. It is also possible that people set additional attributes on the object, e.g. my_object.extra_setting. This means that I cannot just explicitly deepcopy every part of the object except the large DataFrame.

The class is:

    class my_class(object):

        def __init__(self, lots_of_data, small_amount_of_data, setting_1, setting_2):
            self.lots_of_data = lots_of_data
            self.small_amount_of_data = small_amount_of_data
            self.setting_1 = setting_1
            self.setting_2 = setting_2

        def set_setting_3(self, setting_3):
            self.setting_3 = setting_3

        def set_more_metadata(self, metadata):
            self.more_metadata = metadata

And in pseudocode the copy method is:

        def __deepcopy__(self):

            copy_of_object = copy.deepcopy(self[all but object_in.lots_of_data])
            copy_of_object.lots_of_data = self.lots_of_data

            return copy_of_object
user5061

2 Answers

3

Your class needs to implement __deepcopy__(self, memo), which decides which fields get deep-copied and which do not.
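A minimal sketch of what such a `__deepcopy__` could look like, assuming the large data lives in an attribute named `lots_of_data` as in the question. The class name `MyClass` and the nested list standing in for the DataFrame are placeholders for illustration only. It iterates over `self.__dict__`, so attributes added later (such as `extra_setting`) are picked up without being listed explicitly.

```python
import copy


class MyClass:
    """Hypothetical stand-in for the question's my_class."""

    def __init__(self, lots_of_data, small_amount_of_data, setting_1, setting_2):
        self.lots_of_data = lots_of_data
        self.small_amount_of_data = small_amount_of_data
        self.setting_1 = setting_1
        self.setting_2 = setting_2

    def __deepcopy__(self, memo):
        cls = self.__class__
        new_obj = cls.__new__(cls)  # create the instance without calling __init__
        memo[id(self)] = new_obj    # register early so references back to self resolve
        for name, value in self.__dict__.items():
            if name == "lots_of_data":
                # Shallow copy only: the contents stay shared with the original.
                new_obj.lots_of_data = copy.copy(value)
            else:
                # Everything else, including attributes added later, is deep-copied.
                setattr(new_obj, name, copy.deepcopy(value, memo))
        return new_obj


# A nested list plays the role of the large DataFrame here (assumption).
big = [[1, 2], [3, 4]]
obj = MyClass(big, {"a": [1]}, [1], {"k": "v"})
obj.extra_setting = ["x"]  # attribute set after construction

dup = copy.deepcopy(obj)
```

With this in place, a plain `copy.deepcopy(obj)` is enough; callers do not need to know about the special case.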

Pavel
  • How does one do this? I've looked at the docs but I can't figure it out. I'm probably being stupid, but a bit more information would be awesome. – user5061 May 14 '14 at 12:36
  • The implementation looks just like your `copy_my_class`, the only difference being that you'd need to explicitly list all the members to copy. There's also a way to select "all but one member" by means of introspection (see e.g. the `inspect` module), but I'd recommend trying simple things first. – Pavel May 14 '14 at 13:15
  • see also http://stackoverflow.com/questions/1500718/what-is-the-right-way-to-override-the-copy-deepcopy-operations-on-an-object-in-p – Pavel May 14 '14 at 13:19
0

I doubt you're still looking for an answer, but I was looking for one too, so here is my solution for anyone else who needs it:

    import copy

    def special_copy_function(object_to_be_copied: my_class) -> my_class:
        # Pre-seed memo with a shallow copy of the large data
        memo = {id(object_to_be_copied.lots_of_data): copy.copy(object_to_be_copied.lots_of_data)}
        # Deep copy of everything else
        output = copy.deepcopy(object_to_be_copied, memo)
        return output

memo is a dictionary passed to deepcopy to help it remember what it has already copied, to avoid copying an item (referenced in several places) multiple times.

By placing a shallow copy of lots_of_data in the memo dictionary under the key id(lots_of_data), deepcopy treats it as already copied and uses that value instead of deep-copying lots_of_data itself.
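To check that the memo trick behaves as described, here is a small self-contained sketch. The `Wrapper` class and the nested list used in place of a DataFrame are assumptions made only so the example runs without pandas.

```python
import copy


class Wrapper:
    """Hypothetical stand-in for my_class; a nested list plays the DataFrame."""

    def __init__(self, lots_of_data, settings):
        self.lots_of_data = lots_of_data
        self.settings = settings


def special_copy(obj):
    # Pre-seed memo: deepcopy looks objects up by id() before copying them.
    memo = {id(obj.lots_of_data): copy.copy(obj.lots_of_data)}
    return copy.deepcopy(obj, memo)


original = Wrapper([[1, 2], [3, 4]], {"threshold": [0.5]})
duplicate = special_copy(original)

# The outer container is a new object (shallow copy) ...
print(duplicate.lots_of_data is original.lots_of_data)        # False
# ... but its contents are shared with the original.
print(duplicate.lots_of_data[0] is original.lots_of_data[0])  # True
# Settings are fully independent copies.
print(duplicate.settings["threshold"] is original.settings["threshold"])  # False
```

One design note: this approach leaves the class untouched, so an ordinary `copy.deepcopy(original)` elsewhere in the code still copies the large data in full.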

Greg