How to properly subclass dict and override __getitem__ & __setitem__

Question

I am debugging some code and I want to find out when a particular dictionary is accessed. Well, it's actually a class that subclasses dict and implements a couple of extra features. Anyway, what I would like to do is subclass dict myself and override __getitem__ and __setitem__ to produce some debugging output. Right now, I have

class DictWatch(dict):
    def __init__(self, *args):
        dict.__init__(self, args)

    def __getitem__(self, key):
        val = dict.__getitem__(self, key)
        log.info("GET %s['%s'] = %s" % str(dict.get(self, 'name_label')), str(key), str(val)))
        return val

    def __setitem__(self, key, val):
        log.info("SET %s['%s'] = %s" % str(dict.get(self, 'name_label')), str(key), str(val)))
        dict.__setitem__(self, key, val)

'name_label' is a key which will eventually be set that I want to use to identify the output. I have then changed the class I am instrumenting to subclass DictWatch instead of dict and changed the call to the superconstructor. Still, nothing seems to be happening. I thought I was being clever, but I wonder if I should be going a different direction.

Thanks for the help!

Did you try using print instead of log? Also, could you explain how you create and configure your log? — pajton, Mar 06 '10 at 00:39

Matt Anderson · Answer 1 · 2022-07-18T02:55:28.340

86

Another issue when subclassing dict is that the built-in __init__ doesn't call update, and the built-in update doesn't call __setitem__. So, if you want all setitem operations to go through your __setitem__ function, you have to route them there yourself:

class DictWatch(dict):
    def __init__(self, *args, **kwargs):
        self.update(*args, **kwargs)

    def __getitem__(self, key):
        val = dict.__getitem__(self, key)
        print('GET', key)
        return val

    def __setitem__(self, key, val):
        print('SET', key, val)
        dict.__setitem__(self, key, val)

    def __repr__(self):
        dictrepr = dict.__repr__(self)
        return '%s(%s)' % (type(self).__name__, dictrepr)
        
    def update(self, *args, **kwargs):
        print('update', args, kwargs)
        for k, v in dict(*args, **kwargs).items():
            self[k] = v

edited Jul 18 '22 at 02:55

answered Mar 06 '10 at 01:27

Matt Anderson


If you are using Python 3, you'll want to change this example so that `print` is the `print()` function and the `update()` method uses `items()` instead of `iteritems()`. – Al Sweigart Sep 18 '17 at 04:01
I have tried your solution, but it seems that it works for only one level of indexing (i.e., dict[key] and not dict[key1][key2] ...) – ndrwnaguib Apr 04 '19 at 16:42
d[key1] returns something, perhaps a dictionary. The second key indexes that. This technique can’t work unless that returned thing supports the watch behavior also. – Matt Anderson Apr 04 '19 at 16:48
@AndrewNaguib: Why should it work with nested arrays? Nested arrays do not work with a normal Python dict either (unless you implement it yourself) – Igor Chubin May 01 '19 at 11:32
Yes, I did not know that :). For nested indexing levels, `DictWatch(val)` should be returned instead. – ndrwnaguib May 01 '19 at 11:34
@AndrewNaguib: `__getitem__` would need to test `val` and only do that conditionally — i.e. `if isinstance(val, dict): ...` – martineau Sep 18 '19 at 18:46
Having to override 5 methods for a simple case feels overcomplicated. This is why `collections.UserDict` exists. `UserDict` only requires overriding `__setitem__` to be compatible with `__init__`, `setdefault`, `update`, ... – Conchylicultor Nov 02 '20 at 17:01
Subclassing `MutableMapping` or `UserDict` is preferred over subclassing `dict` in most cases. However `UserDict` does not subclass `dict` so if you need the real builtin python `dict` as your parent class, this does not help you. @Conchylicultor – Matt Anderson Nov 18 '20 at 18:47
Does the `update` method take any arguments other than a positional argument for the other dictionary that is used to update the first dictionary? – HelloGoodbye Jul 19 '22 at 09:10
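
The nested-indexing point raised in the comments can be sketched roughly as follows. This is only an illustration under assumptions: the class name `NestedWatch` and its `log` list are hypothetical, and the wrapper is stored back into the parent on first read so that later writes to it persist.

```python
class NestedWatch(dict):
    """Sketch: wraps plain dict values on first read so nested access is watched too."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.log = []  # hypothetical list collecting events instead of printing

    def __getitem__(self, key):
        val = dict.__getitem__(self, key)
        if isinstance(val, dict) and not isinstance(val, NestedWatch):
            val = NestedWatch(val)             # copy the plain dict into a watched one
            val.log = self.log                 # share one event list across levels
            dict.__setitem__(self, key, val)   # keep the wrapper so writes are not lost
        self.log.append(("GET", key))
        return val

    def __setitem__(self, key, val):
        self.log.append(("SET", key, val))
        dict.__setitem__(self, key, val)


d = NestedWatch({"outer": {"inner": 1}})
d["outer"]["inner"] = 2   # both the outer read and the inner write are recorded
```

Note that the wrapper is a copy: any other reference to the original inner dict will not see changes made through the watched one.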

score 45 · Accepted Answer · edited Aug 12 '11 at 06:07

What you're doing should absolutely work. I tested out your class, and aside from a missing opening parenthesis in your log statements, it works just fine. There are only two things I can think of. First, is the output of your log statement set correctly? You might need to put a logging.basicConfig(level=logging.DEBUG) at the top of your script.

Second, __getitem__ and __setitem__ are only called during [] accesses. So make sure you only access DictWatch via d[key], rather than through methods such as d.get() or d.update().
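
A minimal sketch of both points. The names `WatchedDict`, `ListHandler` and the logger name are illustrative assumptions, not from the question; the handler just collects messages in a list so the effect is easy to inspect.

```python
import logging


class ListHandler(logging.Handler):
    """Hypothetical handler that stores formatted messages in a list."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append(record.getMessage())


log = logging.getLogger("dictwatch_demo")
log.setLevel(logging.INFO)  # without this, INFO falls below the default WARNING threshold
handler = ListHandler()
log.addHandler(handler)


class WatchedDict(dict):
    def __getitem__(self, key):
        val = dict.__getitem__(self, key)
        log.info("GET %s = %s", key, val)
        return val

    def __setitem__(self, key, val):
        log.info("SET %s = %s", key, val)
        dict.__setitem__(self, key, val)


d = WatchedDict()
d["a"] = 1     # logged: goes through __setitem__
d["a"]         # logged: goes through __getitem__
d.get("a")     # not logged: dict.get does not call the overridden __getitem__
```

After this runs, `handler.messages` holds only the two entries produced by the `[]` accesses.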

edited Aug 12 '11 at 06:07

adamJLev


answered Mar 06 '10 at 00:42

BrainCore


Actually it's not extra parens, but a missing opening paren around `(str(dict.get(self, 'name_label')), str(key), str(val)))` – cobbal Mar 06 '10 at 00:44
True. To the OP: For future reference, you can simply do log.info('%s %s %s', a, b, c), instead of a Python string formatting operator. – BrainCore Mar 06 '10 at 00:50
Logging level ended up being the issue. I'm debugging someone else's code and I was originally testing in another file which had a different level of debugging set. Thanks! – Michael Mior Mar 06 '10 at 03:01

score 25 · Answer 3 · edited Nov 20 '21 at 22:52

Consider subclassing UserDict or UserList. These classes are intended to be subclassed, whereas the built-in dict and list are not: the built-ins contain optimisations that can bypass methods you override.
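
As a rough sketch of what this looks like for the question's use case, here is a UserDict-based watcher. The `events` attribute is a hypothetical stand-in for the logging calls.

```python
import collections


class DictWatch(collections.UserDict):
    """Sketch: records every write; `events` is an illustrative attribute."""

    def __init__(self, *args, **kwargs):
        self.events = []  # must exist before UserDict.__init__ calls update()
        super().__init__(*args, **kwargs)

    def __setitem__(self, key, val):
        self.events.append(("SET", key, val))
        super().__setitem__(key, val)


d = DictWatch(a=1)      # the constructor routes through __setitem__
d.update(b=2)           # so does update()
d.setdefault("c", 3)    # and setdefault()
d["d"] = 4
```

Only `__setitem__` is overridden here, yet all four writes end up in `d.events`.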

edited Nov 20 '21 at 22:52

wjandrea


answered Mar 26 '18 at 19:21

andrew pate


For reference, the [documentation](https://docs.python.org/3.6/library/collections.html?highlight=userdict#collections.UserDict) in Python 3.6 says "The need for this class has been partially supplanted by the ability to subclass directly from dict; however, this class can be easier to work with because the underlying dictionary is accessible as an attribute". – Sean Sep 16 '18 at 17:33
@andrew an example might be helpful. – Vasantha Ganesh Sep 26 '19 at 09:40
@VasanthaGaneshK https://treyhunner.com/2019/04/why-you-shouldnt-inherit-from-list-and-dict-in-python/ – SirDorius Feb 11 '20 at 15:53

score 9 · Answer 4 · answered Mar 06 '10 at 00:48

This should not really change the result (which should work, given a suitable logging threshold), but your __init__ should be:

def __init__(self,*args,**kwargs) : dict.__init__(self,*args,**kwargs)

instead, because if you call your constructor with DictWatch([(1,2),(2,3)]) or DictWatch(a=1,b=2), the original version will either build the wrong dictionary or raise a TypeError.

(Or, better, don't define a constructor for this at all.)
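
A quick check of how the two constructors behave. The class names `TupleInit` and `ForwardingInit` are illustrative only.

```python
class TupleInit(dict):
    """The question's constructor: passes the args tuple itself to dict."""

    def __init__(self, *args):
        dict.__init__(self, args)


class ForwardingInit(dict):
    """The corrected constructor: unpacks positional and keyword arguments."""

    def __init__(self, *args, **kwargs):
        dict.__init__(self, *args, **kwargs)


pairs = [(1, 2), (2, 3)]
wrong = TupleInit(pairs)        # the whole list is read as one (key, value) pair
right = ForwardingInit(pairs)   # same result as dict(pairs)

try:
    TupleInit(a=1, b=2)
    kwargs_ok = True
except TypeError:               # this __init__ does not accept keyword arguments
    kwargs_ok = False
```

Here `wrong` ends up with the single key `(1, 2)`, while `right` contains the two expected entries.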

answered Mar 06 '10 at 00:48

makapuf


I'm only worried about the `dict[key]` form of access, so this isn't an issue. – Michael Mior Mar 06 '10 at 02:16

score 9 · Answer 5 · edited Oct 21 '22 at 15:40

As Andrew Pate's answer proposed, subclassing collections.UserDict instead of dict is much less error-prone.

Here is an example showing an issue when inheriting dict naively:

class MyDict(dict):

  def __setitem__(self, key, value):
    super().__setitem__(key, value * 10)


d = MyDict(a=1, b=2)  # Bad! MyDict.__setitem__ not called
d.update(c=3)  # Bad! MyDict.__setitem__ not called
d['d'] = 4  # Good!
print(d)  # {'a': 1, 'b': 2, 'c': 3, 'd': 40}

UserDict inherits from collections.abc.MutableMapping, so this works as expected:

import collections

class MyDict(collections.UserDict):

  def __setitem__(self, key, value):
    super().__setitem__(key, value * 10)


d = MyDict(a=1, b=2)  # Good: MyDict.__setitem__ correctly called
d.update(c=3)  # Good: MyDict.__setitem__ correctly called
d['d'] = 4  # Good
print(d)  # {'a': 10, 'b': 20, 'c': 30, 'd': 40}

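Similarly, you only have to implement __getitem__ for my_dict.get to go through it automatically. Note that key in my_dict is an exception: UserDict.__contains__ checks the underlying data directly, without calling __getitem__.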

Note: UserDict is not a subclass of dict, so isinstance(UserDict(), dict) returns False (but isinstance(UserDict(), collections.abc.MutableMapping) returns True).
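
A small sketch to check these last two points. The `ReadLog` class and its `reads` list are hypothetical names used only for this illustration.

```python
import collections
import collections.abc


class ReadLog(collections.UserDict):
    """Sketch: remembers which keys were read through __getitem__."""

    def __init__(self, *args, **kwargs):
        self.reads = []  # set before UserDict.__init__ populates the data
        super().__init__(*args, **kwargs)

    def __getitem__(self, key):
        self.reads.append(key)
        return super().__getitem__(key)


d = ReadLog(a=1)
value = d.get("a")   # reaches the overridden __getitem__ for an existing key

is_dict = isinstance(d, dict)                                # not a dict subclass
is_mapping = isinstance(d, collections.abc.MutableMapping)   # but a MutableMapping
```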

score 1 · Answer 6 · answered Oct 06 '17 at 08:03

All you will have to do is

class BatchCollection(dict):
    def __init__(self, inpt={}):
        super(BatchCollection, self).__init__(inpt)

A sample usage for my personal use

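### EXAMPLE
import collections.abc  # collections.Iterable was removed in Python 3.10

class BatchCollection(dict):
    def __init__(self, inpt={}):
        super(BatchCollection, self).__init__(inpt)

    def __setitem__(self, key, item):
        # Accept only (database_name, table_name) keys with iterable values.
        if (isinstance(key, tuple) and len(key) == 2
                and isinstance(item, collections.abc.Iterable)):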
            # self.__dict__[key] = item
            super(BatchCollection, self).__setitem__(key, item)
        else:
            raise Exception(
                "Valid key should be a tuple (database_name, table_name) "
                "and value should be iterable")

Note: tested only in python3

Since this is Python 3, I recommend just using `super()` instead of `super(BatchCollection, self)` — MestreLion, Oct 15 '21 at 11:47
