As noted in "Fast way to copy dictionary in Python", dict.copy() is much faster than copy.copy(). I have a class that is a descendant of dict with a little extra metadata:

class DictWithTimestamp(dict):
    def __init__(self, timestamp):
        self.timestamp = timestamp

If I just do DictWithTimestamp(1234).copy() then I get a dictionary without a timestamp. Is there a way to preserve the speed of dict.copy() and keep my metadata?

Thomas Johnson
  • Just add a copy function to your dict that calls dict.copy() and then copies your metadata manually. – user3012759 Mar 09 '15 at 15:43
  • Can you give an example of how to do that? If I just called dict.copy() in `__copy__` I would get a plain dict, and you can't set arbitrary attributes (like timestamp) on a dict – Thomas Johnson Mar 09 '15 at 15:47
  • Actually, judging by the code sample you provided, you may want to reconsider how you subclass dict: in your current example you changed `__init__` to accept only a timestamp instead of behaving like an ordinary dict. There are some good SO answers covering that exact topic, but it's much more complex than what you currently have (wrapper, ABC...) – user3012759 Mar 09 '15 at 15:58
  • Is there a reason that `timestamp` needs to be an attribute of the dictionary itself? Why not `DictWithTimestamp = namedtuple('DictWithTimestamp', 'timestamp data')`? I think the composition is warranted: in some sense, your approach involves inheriting a container and giving it a property (timestamp) that doesn't have much to do with being a container. – jme Mar 09 '15 at 16:32
  • why not just add it to the dict and do a lookup to get the timestamp, `self["timestamp"] = timestamp` – Padraic Cunningham Mar 09 '15 at 16:37
  • @PadraicCunningham I thought about that, but many users of this class iterate over the keys and values and expect certain data there, and having that extra timestamp in the dict itself would mess that up. – Thomas Johnson Mar 09 '15 at 17:05
  • @jme That's not a bad idea, I might do that, although it would involve a big refactoring – Thomas Johnson Mar 09 '15 at 17:05

1 Answer

At first I thought about overriding __copy__, but that way you would not be using the built-in dict copy method. So skip to the update below.

You can define a __copy__ method on your DictWithTimestamp class that also copies the additional data. From the docs:

In order for a class to define its own copy implementation, it can define special methods __copy__() and __deepcopy__(). The former is called to implement the shallow copy operation; no additional arguments are passed. The latter is called to implement the deep copy operation; it is passed one argument, the memo dictionary. If the __deepcopy__() implementation needs to make a deep copy of a component, it should call the deepcopy() function with the component as first argument and the memo dictionary as second argument.
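As a rough illustration of that hook applied to the dict subclass from the question, here is a minimal sketch. The `*args, **kwargs` forwarding in `__init__` is an assumption that is not in the original class; it is added so the new instance can be filled from the old one by the built-in dict constructor. This has not been timed against dict.copy(), so treat it as a starting point rather than a measured solution.

```python
import copy


class DictWithTimestamp(dict):
    """dict subclass carrying one extra attribute (illustrative sketch)."""

    def __init__(self, timestamp, *args, **kwargs):
        # Assumption: forward the remaining arguments so the object can
        # still be initialized like an ordinary dict.
        super().__init__(*args, **kwargs)
        self.timestamp = timestamp

    def copy(self):
        # Shallow copy: the items are copied by the dict constructor and
        # the timestamp is passed along explicitly.
        return type(self)(self.timestamp, self)

    def __copy__(self):
        # Make copy.copy() take the same path as .copy().
        return self.copy()


d = DictWithTimestamp(1234, i=1)
d1 = d.copy()        # keeps both the items and the timestamp
d2 = copy.copy(d)    # goes through __copy__
```

Because `copy()` returns `type(self)(...)`, further subclasses keep their own type as long as their `__init__` accepts the same arguments.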

update: you can do it with a collections.abc.MutableMapping subclass (it was collections.MutableMapping on Python 2; read more about it here: How to "perfectly" override a dict?):

from collections.abc import MutableMapping

class DictWithTimestamp(MutableMapping):
    def __init__(self, timestamp=None):
        self.store = dict()
        self.timestamp = timestamp

    def __getitem__(self, key):
        return self.store[key]

    def __setitem__(self, key, value):
        self.store[key] = value

    def __delitem__(self, key):
        del self.store[key]

    def __iter__(self):
        return iter(self.store)

    def __len__(self):
        return len(self.store)

    def setstore(self, store):
        self.store = store

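    def copy(self):
        # Build a new instance with the same timestamp, then give it a
        # shallow copy of the underlying dict via the built-in dict.copy().
        new = self.__class__(self.timestamp)
        new.setstore(self.store.copy())
        return new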

Test:

>>> d = DictWithTimestamp(1234)
>>> d['i'] = 1
>>> d.timestamp
1234
>>> d1 = d.copy()
>>> list(d1.items())
[('i', 1)]
>>> d1.timestamp
1234
ndpu
  • Unfortunately subclassing `MutableMapping` hurts the performance of everything else. For example, using `%timeit` it seems that getting an element with [] is 6x slower than a dict and setting an element with foo[1]=2 is 5x slower. For a direct subclass (as in the question) setting an element is as fast as a regular dict, and getting an element is only about 2.5x slower – Thomas Johnson Mar 09 '15 at 16:16
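The lookup comparison described in the comment above could be reproduced with a sketch along these lines, using the standard timeit module instead of IPython's `%timeit`. The class names `WrappedDict` and `SubclassedDict` and the helper `time_getitem` are hypothetical stand-ins introduced for illustration. The absolute numbers and ratios will vary by machine and Python version, so no expected output is shown.

```python
import timeit
from collections.abc import MutableMapping


class WrappedDict(MutableMapping):
    """Hypothetical stand-in for the MutableMapping-based class in the answer."""

    def __init__(self):
        self.store = {}

    def __getitem__(self, key):
        return self.store[key]

    def __setitem__(self, key, value):
        self.store[key] = value

    def __delitem__(self, key):
        del self.store[key]

    def __iter__(self):
        return iter(self.store)

    def __len__(self):
        return len(self.store)


class SubclassedDict(dict):
    """Hypothetical stand-in for the direct dict subclass in the question."""


def time_getitem(mapping, number=100_000):
    # Total seconds for `number` lookups of key 1; this includes the
    # overhead of calling the lambda, which is the same for every candidate.
    return timeit.timeit(lambda: mapping[1], number=number)


candidates = {
    "dict": {},
    "dict subclass": SubclassedDict(),
    "MutableMapping": WrappedDict(),
}

timings = {}
for name, mapping in candidates.items():
    mapping[1] = 2
    timings[name] = time_getitem(mapping)

for name, seconds in timings.items():
    print(f"{name}: {seconds:.4f} s")
```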