
The following seems strange. Basically, the somedata attribute appears to be shared between all instances of every class that inherits from the_base_class.

class the_base_class:
    somedata = {}
    somedata['was_false_in_base'] = False


class subclassthing(the_base_class):
    def __init__(self):
        print self.somedata


>>> first = subclassthing()
{'was_false_in_base': False}
>>> first.somedata['was_false_in_base'] = True
>>> second = subclassthing()
{'was_false_in_base': True}
>>> del first
>>> del second
>>> third = subclassthing()
{'was_false_in_base': True}

Defining self.somedata in the __init__ function is obviously the correct way to get around this (so each instance has its own somedata dict), but when is such behavior desirable?

Don Kirkby
dbr

3 Answers


You are right: somedata is shared between all instances of the class and its subclasses, because it is created at class definition time. The lines

somedata = {}
somedata['was_false_in_base'] = False

are executed when the class is defined, i.e. when the interpreter encounters the class statement - not when the instance is created (think static initializer blocks in Java). If an attribute does not exist in a class instance, the class object is checked for the attribute.
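The lookup rule described above can be checked directly. The following is only a minimal sketch; the names Base, Child and shared are made up for illustration and do not come from the question:

```python
class Base(object):
    shared = {}  # created once, when the class body is executed


class Child(Base):
    pass


a = Child()
b = Child()

# Neither instance has its own 'shared' entry, so attribute lookup
# falls back to the class (and then to its base classes).
print(vars(a))                   # {}
print(a.shared is Base.shared)   # True
print(a.shared is b.shared)      # True
```

Because both instances resolve the name to the same dict object, a change made through one of them is visible through the other.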

At class definition time, you can run arbitrary code, like this:

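import sys

class Test(object):
    # This branch is evaluated once, while the class body is being executed.
    # startswith() covers both "linux2" (Python 2) and "linux" (Python 3).
    if sys.platform.startswith("linux"):
        def hello(self):
            print "Hello Linux"
    else:
        def hello(self):
            print "Hello ~Linux"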

On a Linux system, Test().hello() will print Hello Linux; on all other systems, the other string will be printed.

In contrast, objects in __init__ are created at instantiation time and belong to the instance only (when they are assigned to self):

class Test(object):
    def __init__(self):
        self.inst_var = [1, 2, 3]
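Applied to the dict from the question, this might look like the sketch below. The class name PerInstance is hypothetical; only the somedata key is taken from the question:

```python
class PerInstance(object):
    def __init__(self):
        # A new dict is built on every instantiation and stored on the instance.
        self.somedata = {'was_false_in_base': False}


first = PerInstance()
first.somedata['was_false_in_base'] = True
second = PerInstance()

print(first.somedata)    # {'was_false_in_base': True}
print(second.somedata)   # {'was_false_in_base': False}
```

Here the change made through first is not visible in second, because each instance refers to its own dict.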

Objects defined on a class object rather than instance can be useful in many cases. For instance, you might want to cache instances of your class, so that instances with the same member values can be shared (assuming they are supposed to be immutable):

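class SomeClass(object):
    # One cache dict, created once at class definition time and shared by all instances.
    __instances__ = {}

    def __new__(cls, v1, v2, v3):
        try:
            return cls.__instances__[(v1, v2, v3)]
        except KeyError:
            # object.__new__ takes only the class; passing v1, v2, v3 raises TypeError on Python 3.
            return cls.__instances__.setdefault(
                (v1, v2, v3),
                object.__new__(cls))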
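To see the effect of such a cache, here is a small self-contained sketch of the same idea. The Point class, its _cache attribute and the x/y fields are assumptions made up for this illustration, not part of the answer's code:

```python
class Point(object):
    _cache = {}  # class-level dict, shared by every call to Point(...)

    def __new__(cls, x, y):
        key = (x, y)
        if key not in cls._cache:
            # Create and remember a new instance only for unseen values.
            instance = object.__new__(cls)
            instance.x = x
            instance.y = y
            cls._cache[key] = instance
        return cls._cache[key]


p = Point(1, 2)
q = Point(1, 2)
r = Point(3, 4)

print(p is q)   # True: the same cached object is returned
print(p is r)   # False: different values, different object
```

Note that in this sketch subclasses of Point would share the same _cache dict, for exactly the reason discussed in the question, so a real implementation would probably want to include cls in the key.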

Mostly, I use data in class bodies in conjunction with metaclasses or generic factory methods.

Torsten Marek

Note that part of the behavior you’re seeing is due to somedata being a dict, as opposed to a simple data type such as a bool.

For instance, consider this very similar example, which nevertheless behaves differently:

class the_base_class:
    somedata = False

class subclassthing(the_base_class):
    def __init__(self):
        print self.somedata


>>> first = subclassthing()
False
>>> first.somedata = True
>>> print first.somedata
True
>>> second = subclassthing()
False
>>> print first.somedata
True
>>> del first
>>> del second
>>> third = subclassthing()
False

This example behaves differently from the one given in the question because here first.somedata is being given a new value (the object True), whereas in the first example the dict object referenced by first.somedata (and also by the other subclass instances) is being modified.
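The same distinction shows up with a dict alone, depending on whether it is modified in place or the attribute is assigned a new object. A minimal sketch, with the class name Base and the key 'flag' chosen only for illustration:

```python
class Base(object):
    somedata = {'flag': False}


first = Base()
second = Base()

# Mutation: changes the one dict that the class, and so every instance, refers to.
first.somedata['flag'] = True
print(second.somedata)             # {'flag': True}

# Rebinding: creates an instance attribute on 'first' that shadows the class attribute.
first.somedata = {'flag': 'mine'}
print(second.somedata)             # {'flag': True}
print('somedata' in vars(first))   # True
print('somedata' in vars(second))  # False
```

After the assignment, first has its own somedata, while second still reaches the class-level dict.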

See Torsten Marek’s comment to this answer for further clarification.

Geoff Reedy
TimB
  • Are you saying that simple data types are not shared while complex ones are? – OscarRyz Oct 15 '08 at 22:45
  • No, they are shared, every data type in Python is a reference type, even integers and booleans etc. The reason is that first.somedata does not contain the value False/True, it references the object False/True. If it is reassigned, it simply references a different object. – Torsten Marek Oct 15 '08 at 23:12
  • @Oscar, I suspect the difference is due to the nature of references. When somedata is a dictionary then first and second have their own reference but each reference points to the same dictionary in memory. When somedata is boolean they still get their own copies, but they modify the bool directly. – Jason Dagit Oct 15 '08 at 23:46

I think the easiest way to understand this (so that you can predict behavior) is to realize that your somedata is an attribute of the class and not the instance of that class if you define it that way.

There is really only one somedata at all times, because in your example you didn't assign to that name; you used it to look up a dict and then assigned an item (key, value) to that dict. It's a gotcha that is a consequence of how the Python interpreter works, and it can be confusing at first.

Toni Ruža