Here is a twofold question, with a theoretical part, and a practical one:

When subclassing dict:

class ImageDB(dict):
    def __init__(self, directory):
        dict.__init__(self)  # Necessary?? 
        ...

Should dict.__init__(self) be called, just as a "safety" measure (e.g., in case there are some non-trivial implementation details that matter)? Is there a risk that the code will break with a future version of Python if dict.__init__() is not called? I'm looking for a fundamental reason for doing one thing or the other here (in practice, calling dict.__init__() is safe).

My guess is that when ImageDB.__init__(self, directory) is called, self is already a new empty dict object, and that there is therefore no need to call dict.__init__ (I do want the dict to be empty, at first). Is this correct?

Edit:

The more practical question behind the fundamental question above is the following. I was thinking of subclassing dict because I would use the db[…] syntax quite often (instead of doing db.contents[…] all the time); the object's only data (attribute) is indeed really a dict. I want to add a few methods to the database (such as get_image_by_name(), or get_image_by_code(), for instance), and only override the __init__(), because the image database is defined by the directory that contains it.

In summary, the (practical) question could be: what is a good implementation for something that behaves like a dictionary, except that its initialization is different (it only takes a directory name), and that it has additional methods?

"Factories" were mentioned in many answers. So I guess it all boils down to: do you subclass dict, override __init__() and add methods, or do you write a (factory) function that returns a dict, to which you add methods? I'm inclined to prefer the first solution, because the factory function returns an object whose type does not indicate that it has additional semantics and methods, but what do you think?

Edit 2:

I gather from everybody's answer that it is not a good idea to subclass dict when the new class "is not a dictionary", and in particular when its __init__ method cannot take the same arguments as dict's __init__ (which is the case in the "practical question" above). In other words, if I understand correctly, the consensus seems to be: when you subclass, all methods (including initialization) must have the same signature as the base class methods. This allows isinstance(subclass_instance, dict) to guarantee that subclass_instance.__init__() can be used like dict.__init__(), for instance.

Another practical question then pops up: how should a class which is just like dict, except for its initialization method, be implemented? Without subclassing? That would require some bothersome boilerplate code, no?

Eric O. Lebigot
  • A factory function is the way to go. If you need to customize *instance* behaviour, then you may want to create a subclass. If you just want to override *initialization* you needn't subclass anything, because your instances aren't different from the standard ones. Remember that __init__ isn't considered to be part of the interface of the instance, but of the class. – Alan Franzoni Jan 10 '10 at 15:01
  • As far as I see it for this problem it would be best to add the `__getitem__` method to your ImageDB instead of subclassing a dict, because it's _not_ a dict. This allows you to do what you want, _without_ having all those methods like `pop()` which seem to be inappropriate for your class. – Georg Schölly Jan 10 '10 at 22:02
  • @gs: Good point, about pop; it is indeed, at least for the moment, irrelevant (the database contents is defined upon initialization only). I think that it is indeed best that the implementation tightly fit the necessary features. – Eric O. Lebigot Jan 11 '10 at 08:57

5 Answers

17

You should probably call dict.__init__(self) when subclassing; in fact, you don't know what's happening precisely in dict (since it's a builtin), and that might vary across versions and implementations. Not calling it may result in improper behaviour, since you can't know where dict is holding its internal data structures.

By the way, you didn't tell us what you want to do; if you want a class with dict (mapping) behaviour, and you don't really need a dict (e.g. there's no code doing isinstance(x, dict) anywhere in your software, as it should be), you're probably better off using UserDict.UserDict or UserDict.DictMixin if you're on Python <= 2.5, or collections.MutableMapping if you're on Python >= 2.6 (in Python 3, these live at collections.UserDict and collections.abc.MutableMapping). Those will give your class full dict-like behaviour.
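As a rough illustration of the MutableMapping route, here is a minimal sketch for Python 3. The class name ImageDB comes from the question; the `_contents` attribute and the way the mapping is filled are assumptions made for the example only. Once the five abstract methods are defined, the mixin supplies the rest (get, update, pop, items, and so on).

```python
from collections.abc import MutableMapping


class ImageDB(MutableMapping):
    """Mapping-like image database that is NOT a dict subclass (sketch)."""

    def __init__(self, directory):
        self.directory = directory
        self._contents = {}  # hypothetical internal storage

    # The five abstract methods required by MutableMapping:
    def __getitem__(self, key):
        return self._contents[key]

    def __setitem__(self, key, value):
        self._contents[key] = value

    def __delitem__(self, key):
        del self._contents[key]

    def __iter__(self):
        return iter(self._contents)

    def __len__(self):
        return len(self._contents)


db = ImageDB("images")
db["cat"] = "cat.png"
db.update(dog="dog.png")  # update() is provided by the mixin
```

Because __init__ here is not bound by dict's signature, it can take just a directory name without surprising anyone who expects dict(...) semantics.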

EDIT: I read in another comment that you're not overriding any of dict's methods! Then there's no point in subclassing at all; don't do it.

def createImageDb(directory):
    d = {}
    # do something to fill in the dict
    return d

EDIT 2: you want to inherit from dict to add new methods, but you don't need to override any. Then a good choice might be:

class MyContainer(dict):
    def newmethod1(self, args):
        pass

    def newmethod2(self, args2):
        pass


def createImageDb(directory):
    d = MyContainer()
    # fill the container
    return d

By the way: what methods are you adding? Are you sure you're creating a good abstraction? Maybe you'd better use a class which defines the methods you need and use a "normal" dict internally to it.
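To make that last suggestion concrete, here is one possible sketch of a class that keeps a "normal" dict internally and exposes only what it needs. The method name get_image_by_name is taken from the question; the `contents` parameter and the `_images` attribute are hypothetical and exist only so the example can run without touching the file system.

```python
class ImageDB:
    """Read-only image database wrapping a plain dict (sketch)."""

    def __init__(self, directory, contents=None):
        self.directory = directory
        # Hypothetical: real code would build this by scanning `directory`.
        self._images = dict(contents or {})

    # Only the lookups the class really needs are exposed.
    def __getitem__(self, name):
        return self._images[name]

    def __contains__(self, name):
        return name in self._images

    def __len__(self):
        return len(self._images)

    def __iter__(self):
        return iter(self._images)

    def get_image_by_name(self, name):
        # Returns None instead of raising when the image is unknown.
        return self._images.get(name)


db = ImageDB("images", {"cat": "cat.png"})
print(db["cat"])
```

The db[…] syntax still works, but mutating methods such as pop() are simply absent, so the class cannot be misused as a general-purpose dict.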

Factory func: http://en.wikipedia.org/wiki/Factory_method_pattern

It's simply a way of delegating the construction of an instance to a function instead of overriding/changing its constructors.
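For instance, a factory for the image database might look roughly like the sketch below. The function name create_image_db, the choice of keying images by file name without extension, and the stored value (the full path) are all assumptions made for illustration; the question does not say how the database is actually filled.

```python
import os


class ImageDB(dict):
    """dict with extra methods; __init__ is deliberately left untouched."""

    def get_image_by_name(self, name):
        return self[name]


def create_image_db(directory):
    """Factory: build an ImageDB from the files found in `directory`."""
    db = ImageDB()
    for filename in sorted(os.listdir(directory)):
        # Assumption: each image is keyed by its file name minus extension.
        name, _ext = os.path.splitext(filename)
        db[name] = os.path.join(directory, filename)
    return db
```

The returned object still has its own type (ImageDB), so its extra methods and semantics remain visible, while the constructor keeps the standard dict signature.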

ThiefMaster
Alan Franzoni
  • +1: subclassing when there's no need to subclass is a bad idea, a factory's much better. – Alex Martelli Jan 09 '10 at 16:21
  • Even if I don't override dict methods, the new class does have additional methods,… (I'm investigating factories, thank you for the pointer!) – Eric O. Lebigot Jan 10 '10 at 10:40
  • I'm not sure about UserDict: the documentation reads "This module also defines a class, UserDict, that acts as a wrapper around dictionary objects. The need for this class has been largely supplanted by the ability to subclass directly from dict (a feature that became available starting with Python version 2.2)." – Eric O. Lebigot Jan 10 '10 at 10:42
  • OK, then: if you're *extending* dict with new methods, it's all right to inherit from dict, but I would advise against overriding `__init__`. I'll edit my post once more. – Alan Franzoni Jan 10 '10 at 11:19
  • Sorry, I do override `__init__` (I detailed this in a recent version of the question), but nothing else… – Eric O. Lebigot Jan 10 '10 at 11:48
  • I agree @Alan: I've often subclassed dict() so I could add custom methods to my object, whilst otherwise accessing the object just like a dictionary. In most cases, I don't override __init__ at all. (If I did, I'm *sure* I'd call the superclass's __init__.) – Dan H Sep 27 '12 at 12:25
  • Sometimes I subclass dict just so my objects will be natively JSON serializable without having to resort to shenanigans. – penchant Apr 18 '16 at 18:01
14

You should generally call the base class's __init__, so why make an exception here?

Either do not override __init__ or, if you need to override it, call the base class __init__. If you are worried about the arguments, just pass *args, **kwargs through, or pass nothing if you want an empty dict, e.g.

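class MyDict(dict):
    def __init__(self, *args, **kwargs):
        # Remove our own keyword first; otherwise dict would store it as a key.
        self.myparam = kwargs.pop('myparam', '')
        # Everything else is forwarded unchanged to dict.
        dict.__init__(self, *args, **kwargs)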

We shouldn't make assumptions about what the base class does or doesn't do; it is wrong not to call the base class __init__.
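The same pattern can be written for Python 3 with the zero-argument form of super() and a keyword-only parameter, which avoids the manual kwargs.pop(). This is only a sketch; the class name TaggedDict and the `tag` parameter are made up for the example.

```python
class TaggedDict(dict):
    """dict subclass with one extra, keyword-only constructor argument."""

    def __init__(self, *args, tag="", **kwargs):
        # Forward the usual dict arguments to the base class untouched.
        super().__init__(*args, **kwargs)
        # `tag` is hypothetical extra state kept as an instance attribute.
        self.tag = tag


d = TaggedDict({"a": 1}, b=2, tag="demo")
print(d, d.tag)
```

Note that this keeps the dict constructor signature usable as-is, at the cost of reserving one keyword (`tag`) that can then no longer be used as an initial key via keyword arguments.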

Anurag Uniyal
  • Calling the dict `__init__` is indeed what I'm currently doing. Since it looks like calling it with no arguments does not do anything, I'm just curious about fundamental facts about Python that would allow it not to be called! – Eric O. Lebigot Jan 10 '10 at 10:57
  • @EOL, IMO it is just plain wrong not to call the base class __init__, unless there is a very very strong reason to do otherwise – Anurag Uniyal Jan 10 '10 at 10:59
  • @Anurag: I see your point. I am trying to push my knowledge of Python a little bit further, and was wondering whether such a "very very strong reason" for not calling `dict.__init__(self)` (with no other arguments) does exist (like "it will never do anything"). – Eric O. Lebigot Jan 10 '10 at 12:03
  • You might even use `super(MyDict, self).__init__(…)`. – Georg Schölly Jan 10 '10 at 22:00
3

Beware of pickling when subclassing dict; this, for example, needs __getnewargs__ in 2.7, and maybe __getstate__ and __setstate__ in older versions. (I have no idea why.)

class Dotdict( dict ):
    """ d.key == d["key"] """

    def __init__(self, *args, **kwargs):
        dict.__init__( self, *args, **kwargs )
        self.__dict__ = self

    def __getnewargs__(self):  # for cPickle.dump( d, file, protocol=-1)
        return tuple(self)
denis
2

PEP 372 deals with adding an ordered dict to the collections module.

It warns that "subclassing dict is a non-trivial task and many implementations don't override all the methods properly which can lead to unexpected results."

The proposed (and accepted) patch to Python 3.1 uses an __init__ that looks like this:

+class OrderedDict(dict, MutableMapping):
+    def __init__(self, *args, **kwds):
+        if len(args) > 1:
+            raise TypeError('expected at most 1 arguments, got %d' % len(args))
+        if not hasattr(self, '_keys'):
+            self._keys = []
+        self.update(*args, **kwds)

Based on this, it looks like dict.__init__() does not need to be called.
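A quick experiment points the same way. The sketch below assumes CPython, where the object handed to __init__ has already been created by dict.__new__ as an empty, working dict; I am not aware of a documented guarantee that every implementation or future version behaves this way, so treat it as an observation rather than a promise. The `directory` attribute mirrors the question's constructor.

```python
class ImageDB(dict):
    """dict subclass that deliberately does NOT call dict.__init__."""

    def __init__(self, directory):
        # No dict.__init__(self) here: `self` already exists, because
        # dict.__new__ ran before __init__ was invoked.
        self.directory = directory


db = ImageDB("images")
db["cat"] = "cat.png"  # item assignment works on the uninitialised dict
print(len(db), isinstance(db, dict))
```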

Edit: If you are not overriding or extending any of dict's methods, then, I agree with Alan Franzoni: use a dict factory rather than subclassing:

def makeImageDB(*args, **kwargs):
    d = {}
    # modify d
    return d
unutbu
  • This is interesting. Now, not calling `dict.__init__()` with Python 3.1 is safe, but what about the future? Since I do not override any method, in ImageDB, subclassing is very safe; only the initialization is special (it builds the dict). – Eric O. Lebigot Jan 09 '10 at 12:22
  • Sorry EOL, I'm not following you. In my mind, Python 3.1 is the future... :) – unutbu Jan 09 '10 at 12:34
  • Take into consideration what the init is actually doing. It updates the dict with all the args and keywords. That is something your class will have to do, so calling `dict.__init__(self, *args, **kwds)` probably takes care of that for you, or you'll have to call self.update, like the OrderedDict does. – Tor Valamo Jan 09 '10 at 13:11
  • @Tor Valamo: I have added details to the kind of functionality I'm looking for. Basically, the only data contained in the class is a dictionary, and I would like to access it directly through db[…] instead of db.contents[…]. Objects are never created with arguments like those of a standard dict. – Eric O. Lebigot Jan 10 '10 at 10:50
0

If you plan to subclass a built-in type like dict, you may also consider UserDict from the collections module. UserDict is designed to be subclassed.
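A minimal Python 3 sketch of that approach, using the question's ImageDB name and a `directory` argument as placeholders. Unlike with dict, calling the base __init__ is not optional here: UserDict.__init__ is what creates the underlying `data` dictionary.

```python
from collections import UserDict


class ImageDB(UserDict):
    """Dictionary-like class built on UserDict (sketch)."""

    def __init__(self, directory):
        # Required: this sets up self.data, the wrapped plain dict.
        super().__init__()
        self.directory = directory


db = ImageDB("images")
db["cat"] = "cat.png"
print(db.data)
```

Keep in mind that a UserDict instance is not a dict, so isinstance(db, dict) is False.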

prosti
  • Note that the documentation reads "The need for [the UserDict] class has been largely supplanted by the ability to subclass directly from dict". – Eric O. Lebigot Jun 11 '19 at 07:44
  • Thanks, and yes, it seems that `UserDict` was mainly interesting before Python 2.2, when you could not subclass `dict`. – prosti Jun 14 '19 at 10:29