3

I just started building a text-based game yesterday as an exercise in learning Python (I'm using 3.3). I say "text-based game," but I mean more of a MUD than a choose-your-own-adventure. Anyway, I was really excited when I figured out how to handle inheritance and multiple inheritance using super() yesterday, but I found that the argument-passing really cluttered up the code and required juggling lots of little loose variables. Also, creating save files seemed pretty nightmarish.

So, I thought, "What if certain class hierarchies just took one argument, a dictionary, and just passed the dictionary back?" To give you an example, here are three classes trimmed down to their __init__ methods:

class Actor:
    def __init__(self, in_dict,**kwds):
        super().__init__(**kwds)
        self._everything = in_dict
        self._name = in_dict["name"]
        self._size = in_dict["size"]
        self._location = in_dict["location"]
        self._triggers = in_dict["triggers"]
        self._effects = in_dict["effects"]
        self._goals = in_dict["goals"]
        self._action_list = in_dict["action list"] 
        self._last_action = ''
        self._current_action = '' # both ._last_action and ._current_action get updated by .update_action()

class Item(Actor):
    def __init__(self, in_dict, **kwds):
        super().__init__(in_dict, **kwds)
        self._can_contain = in_dict["can contain"] # boolean entry
        self._inventory = in_dict["inventory"] # either a list or dict entry

class Player(Actor):
    def __init__(self, in_dict,**kwds):
        super().__init__(in_dict,**kwds)
        self._inventory = in_dict["inventory"] #entry should be a Container object
        self._stats = in_dict["stats"]

Example dict that would be passed:

playerdict = {'name': '', 'size': '0', 'location': '', 'triggers': None, 'effects': None, 'goals': None, 'action list': None, 'inventory': Container(), 'stats': None}

(The None values get replaced by {} once the dictionary has been passed.)

So, in_dict gets passed up the inheritance chain instead of a huge payload of **kwds. I like this because:

  1. It makes my code a lot neater and more manageable.
  2. As long as the dict has an entry for every key that gets looked up, nothing breaks. It also doesn't matter if a given entry never gets used.
  3. It seems like file I/O just got a lot easier (player data stored as dicts, item data stored as dicts, etc.; see the sketch after this list).
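
For point 3, here's the kind of thing I have in mind (just a rough sketch; the file name and the trimmed-down dict are placeholders, and anything that isn't plain data, like the Container object, would need converting before saving):

import json

# Simplified save/load: only JSON-friendly values here; objects like Container()
# would need to be turned into plain data first.
playerdata = {'name': 'bob', 'size': '0', 'location': 'here', 'stats': {}}

with open('savefile.json', 'w') as f:
    json.dump(playerdata, f)      # save: dict -> file

with open('savefile.json') as f:
    loaded = json.load(f)         # load: file -> dict, ready to hand back to Player()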

I get the point of **kwds (EDIT: apparently I didn't), and it hasn't seemed cumbersome when passing fewer arguments. This just appears to be a comfortable way of dealing with the need for a large number of attributes at the creation of each instance.

That said, I'm still a major Python noob. So, my question is this: Is there an underlying reason why passing the same dict repeatedly through super() to the base class would be a worse idea than just toughing it out with nasty (big and cluttered) **kwds passes? (e.g. issues with the interpreter that someone at my level would be ignorant of.)

EDIT:

Previously, creating a new Player might have looked like this, with an argument passed for each attribute.

bob = Player('bob', Location = 'here', ... etc.)   

The number of arguments needed blew up, and I only included the attributes that really needed to be present to not break method calls from the Engine object.
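
With the dict approach, the same creation collapses to a single argument (using the playerdict above):

bob = Player(playerdict)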

This is the impression I'm getting from the answers and comments thus far:

There's nothing "wrong" with sending the same dictionary along, as long as nothing has the opportunity to modify its contents (Kirk Strauser) and the dictionary always has what it's supposed to have (goncalopp). The real answer is that the question was amiss, and using in_dict instead of **kwds is redundant.

Would this be correct? (Also, thanks for the great and varied feedback!)

eenblam
  • maybe I missed the point of the question but `kwds` is a dict, and can be used just like one. Using `**` just gives an additional way to call it. You could just call `Player(**playerdict)`, why is that cluttered? – cmd Oct 22 '13 at 15:49
  • Bear in mind that `**kwds` *is* a way for a function to accept a dictionary as an argument. It's just that (a) the syntax to call it is different from a function that takes a single argument that's supposed to be a dictionary; (b) there's an extra copy if each function in turn takes `**kwds` and calls with `__init__(**kwds)`, compared with if it takes `kwds` and calls with `kwds`. – Steve Jessop Oct 22 '13 at 15:50
  • @cmd The original model involved passing lots of individual arguments (many of which were dictionaries that are now being passed within the dictionary in question,) which would be cluttered. Basically, what Carl describes. – eenblam Oct 22 '13 at 18:13

3 Answers

1

I've done that myself where in_dict was a dict with lots of keys, or a settings object, or some other "blob" of something with lots of interesting attributes. That's perfectly OK if it makes your code cleaner, particularly if you name it clearly like settings_object or config_dict or similar.

That shouldn't be the usual case, though. Normally it's better to pass a small set of individual variables explicitly. It makes the code much cleaner and easier to reason about. A client could accidentally pass in_dict = None and you wouldn't know until some method tried to access it. Suppose Actor.__init__ didn't peel apart in_dict but just stored it, like self.settings = in_dict. Some time later, Actor.method comes along, tries to access it, and boom! Dead process. If Actor.__init__ takes explicit parameters like Actor.__init__(var1, var2, ...), a bad call raises an exception much earlier and gives you more context about what actually went wrong.
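
To make that concrete, here's a toy sketch (the class and attribute names are made up for illustration, not taken from your game):

class LazyActor:
    def __init__(self, in_dict):
        self.settings = in_dict        # nothing validated here

    def name(self):
        return self.settings["name"]   # only fails now if in_dict was None

class StrictActor:
    def __init__(self, name, size):    # explicit parameters
        self.name = name
        self.size = size

actor = LazyActor(None)                # accepted silently...
# actor.name()                         # ...TypeError: 'NoneType' object is not subscriptable

# StrictActor("bob")                   # fails right at the call site instead:
#                                      # TypeError: __init__() missing 1 required positional argument: 'size'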

So yes, by all means: feel free to do that when it's appropriate. Just be aware that it's not appropriate very often, and the desire to do it might be a smell telling you to restructure your code.

Kirk Strauser
  • I can see why it would be better to get the exception sooner than later, but I'm not following where it might come from. Assuming there wasn't a problem with my code constructing the dictionary from file data, can you explain when I could expect `in_dict = None`? Or is it just a matter of "best practice"? e.g. `stick_data` would be a dictionary containing the attributes needed for a stick, which is kept in `item_dictionary`, which is constructed from a file read. I don't have complete code to show for this, since I'm nailing down how I need that material organized (thus this question.) – eenblam Oct 22 '13 at 18:28
  • There are a million ways an object could end up empty or None, most involving retrieving user input. Maybe you're receiving JSON from a web request. Perhaps you're selecting from a database and got no results. Or you could be reading command line input and the user accidentally hit "return" too early. In any of those cases (and many more), you want to do input validation at the place you're receiving input, not in the core of your library code. – Kirk Strauser Oct 22 '13 at 20:13
  • Ha! I've been reading about JSON (particularly, retrieving nested dictionaries) since my previous comment (figuring out the save files.) Also new to me. Anyway, I take you to mean, "It's fine and dandy to use this in the very specific case where you can be certain that `None` (or some other inappropriate argument) won't be passed to the `__init__` methods." Is this roughly what you intended? – eenblam Oct 22 '13 at 20:29
    "Just be aware that it's not appropriate very often, and the desire to do it might be a smell telling you to restructure your code." I think I see your point. Presently, I've only written maybe a dozen classes and have started thinking of the details in terms of complicated nested dictionaries that influence each other via the basic, associated class's methods. Really, I can cut down on this by getting the basics down in Player, and then refining them with more and more specific classes. Then, details would be defined by choice of class instead of providing tons of input. I feel silly. – eenblam Oct 22 '13 at 21:17
  • @Ben Yes, that's the way I meant it. I'd strongly prefer your second approach of having simple methods that are easier to reason about (and easier to test!), then changing their behavior with subclassing. – Kirk Strauser Oct 22 '13 at 23:31
1

I'm not sure I understand your question exactly, because I can't see how the code looked before you made the change to use in_dict. It sounds like you have been listing out dozens of keyword arguments in the call to super (which is understandably not what you want), but this is not necessary. If your child class has a dict with all of this information, it can be turned into kwargs when you make the call with **in_dict. So:

class Actor:
    def __init__(self, **kwds):
        pass  # unpack whichever keys Actor needs here, e.g. self._name = kwds["name"]

class Item(Actor):
    def __init__(self, **kwds):
        self._everything = kwds
        super().__init__(**kwds)

I don't see a reason to add another dict for this, since you can just manipulate and pass the dict created for kwds anyway.
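
For example, with the classes above (and a made-up dict), these two calls hand __init__ exactly the same kwds contents:

item_data = {'name': 'stick', 'location': 'forest'}

stick = Item(**item_data)                     # expand an existing dict into keyword args
twig = Item(name='twig', location='forest')   # or spell the keywords out directly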

Edit:

As for the question of the efficiency of using the ** expansion of the dict versus listing the arguments explicitly, I did a very unscientific timing test with this code:

import time

def some_func(**kwargs):
    for k,v in kwargs.items():
        pass

def main():
    name = 'felix'
    location = 'here'
    user_type = 'player'

    kwds = {'name': name,
            'location': location,
            'user_type': user_type}

    start = time.time()
    for i in range(10000000):
        some_func(**kwds)

    end = time.time()
    print('Time using expansion:\t{0}s'.format(end - start))
    start = time.time()
    for i in range(10000000):
        some_func(name=name, location=location, user_type=user_type)

    end = time.time()
    print('Time without expansion:\t{0}s'.format(end - start))


if __name__ == '__main__':
    main()

Running each loop 10,000,000 times gives a slight (and probably statistically meaningless) advantage to passing around a dict and using **.

Time using expansion:   7.9877269268s
Time without expansion: 8.06108212471s

If you print the IDs of the dict objects (kwds outside and kwargs inside the function), you'll see that Python builds a fresh dict for the function to use in either case. The id inside the function may look the same from call to call, but that's just CPython reusing the memory of the previous, already-discarded kwargs dict; each call really does get its own new dict, no matter how you call it. (See also this enlightening SO question about how mutable default parameters are handled in Python, which is somewhat related.)
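
One quick way to check this (using a throwaway some_check function and holding a reference to each kwargs dict so its memory can't be recycled between calls):

captured = []

def some_check(**kwargs):
    captured.append(kwargs)        # keep the dict alive so its id can't be reused
    return id(kwargs)

ids = [some_check(name='felix') for _ in range(5)]
print(len(set(ids)))               # 5 -- each call gets its own kwargs dict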

So from a performance perspective, you can pick whichever makes sense to you. It should not meaningfully impact how Python operates behind the scenes.

Carl
  • Thanks, that makes sense. Could you address my conclusion in the updated question? – eenblam Oct 22 '13 at 18:48
  • Yes, I think your conclusion is accurate. Another benefit here is that `kwds` can contain things that the super class doesn't need, and that's ok. It only needs to extract the values that are important to it. But your conclusion that the main issue of using `in_dict` is redundant is true, as `kwds` will serve to reduce the argument list just the same. I am adding some timing information to my answer also that compares the two ways of calling a function with many named arguments – Carl Oct 28 '13 at 15:54
0

This is not Python-specific, but the greatest problem I can see with passing arguments like this is that it breaks encapsulation. Any class in the hierarchy may modify the shared dictionary, and it's much more difficult to tell which arguments each class expects, which makes your code difficult to understand and harder to debug.
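
For example (Base, Child and settings are made-up names, just to show the hazard):

class Base:
    def __init__(self, in_dict):
        self._shared = in_dict                      # every class holds the same dict object

class Child(Base):
    def __init__(self, in_dict):
        in_dict["name"] = in_dict["name"].upper()   # "tidying up" mutates the caller's dict
        super().__init__(in_dict)

settings = {"name": "bob"}
Child(settings)
print(settings["name"])                             # 'BOB' -- changed as a side effect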

Consider explicitly consuming the arguments each class needs, and calling the superclass's __init__ with whatever remains. You don't need to spell the remaining arguments out:

class ClassA(object):
    def __init__(self, arg1, arg2=""):
        pass

class ClassB(ClassA):
    def __init__(self, arg3, arg4="", *args, **kwargs):
        # keep arg3/arg4 for ClassB, forward whatever is left to ClassA
        ClassA.__init__(self, *args, **kwargs)


ClassB(3, 4, 1, 2)   # arg3=3, arg4=4; ClassA receives arg1=1, arg2=2

You can also leave the attributes uninitialized and use methods to set them. You can then define different methods in the different classes, and all subclasses will have access to the superclass's methods.
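
A rough sketch of that idea (the names are only for illustration):

class ClassC(object):
    def __init__(self):
        self._name = None              # left unset until a setter is called

    def set_name(self, name):
        self._name = name

class ClassD(ClassC):
    def set_stats(self, stats):        # the subclass adds its own setter...
        self._stats = stats

d = ClassD()
d.set_name("bob")                      # ...and can still use the superclass's setters
d.set_stats({"hp": 10})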

loopbackbee
  • This was the original idea (prior to trying a dictionary) in that it required reading lots of individual data from a file and then passing each individually. I'm still trying to get a feel for the idea of encapsulation in the context of Python, since these attributes are all public. Presently, the `__init__` attributes are critical; things break if they aren't there. Does order count here? That is, would the (personal) convention of calling `super().__init__(**kwds)` in the very first line of the method not move the dictionary to the ancestor before anything else could modify an entry? – eenblam Oct 22 '13 at 18:39