4

I wonder if there is any difference in performance when accessing a class variable (a dict) inside a method of the same class using:

self.class_variable_dict[some_key] = some_value

and

ClassName.class_variable_dict[some_key] = some_value

Obviously, both will work as long as there is no instance variable with the same name, but is there any reason or use case for preferring one over the other?

senderle
MLister

3 Answers

6

Accessing it via ClassName rather than via self will be slightly faster, since if you access it via self it must first check the instance namespace. But I don't expect the difference to be at all significant, unless you have profiling information to suggest that it is.

So I would recommend using whichever one you think is easier to read/understand as a human.

Semantically, they will be different only if the class_variable_dict variable gets shadowed somewhere -- in particular, if (a) self defines a variable of the same name; or (b) self is an instance of a subclass of ClassName, and that subclass (or one of its bases that's still a subclass of ClassName) defines a variable of the same name. If neither of those is true, then they should be semantically identical.
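A minimal sketch of the two shadowing cases (a) and (b) described above. The names `Base`, `Child`, `shared`, `via_self` and `via_class` are made up for illustration and are not from the question.

```python
class Base:
    shared = {}  # class variable

    def via_self(self):
        # Looks on the instance first, then on type(self) and its bases.
        return self.shared

    def via_class(self):
        # Always starts the lookup at Base, whatever type(self) is.
        return Base.shared


class Child(Base):
    shared = {"owner": "Child"}  # case (b): a subclass defines the same name


# No shadowing: both spellings return the very same dict object.
plain = Base()
same_when_unshadowed = plain.via_self() is plain.via_class()

# Case (b): self.shared finds Child.shared, Base.shared does not.
child = Child()
child_differs = child.via_self() is not child.via_class()

# Case (a): an instance attribute of the same name hides the class variable.
shadowed = Base()
shadowed.shared = {"owner": "instance"}
instance_differs = shadowed.via_self() is not shadowed.via_class()

print(same_when_unshadowed, child_differs, instance_differs)
```

If subclasses are expected to override the dict, `self.` is probably the spelling you want; if the dict must always be the one defined on the base class, naming the class explicitly avoids surprises.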

Edit:

delnam has a good point: there are factors that might make either faster. I stand by my assertion that the difference will be trivial unless it's in a very very tight loop. To test it, I created the tightest loop I could think of, and timed it with timeit. Here are the results:

  • access via class var: 20.226 seconds
  • access via inst var: 23.121 seconds

Based on several runs, the error bars look to be about 1 second, i.e., this is a statistically significant difference, but probably not worth worrying about. Here's my test code:

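import timeit

# Setup code that timeit runs once before timing each statement.
setup = '''
class A:
    var = {}
    def f1(self):
        x = A.var       # access via the class name
    def f2(self):
        x = self.var    # access via the instance

a = A()
'''
# print() with a single argument runs on both Python 2 and Python 3.
print('access via class var: %.3f' % timeit.timeit('a.f1()', setup=setup, number=100000000))
print('access via inst var: %.3f' % timeit.timeit('a.f2()', setup=setup, number=100000000))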
Edward Loper
  • 1
    OTOH, looking up `self` will be faster, as it's a local and hence accessed from a C array, whereas `ClassName` is a global and has to be looked up in a dict. There's no way to know which one is faster, except profiling! –  May 22 '12 at 17:19
3

Let's look at what the different options all do.

In [1]: class Foo:
   ...:     bar = {}
   ...:     

In [2]: import dis
In [3]: dis.dis(lambda: Foo.bar.add(1,2))
  1           0 LOAD_GLOBAL              0 (Foo) 
              3 LOAD_ATTR                1 (bar) 
              6 LOAD_ATTR                2 (add) 
              9 LOAD_CONST               1 (1) 
             12 LOAD_CONST               2 (2) 
             15 CALL_FUNCTION            2 
             18 RETURN_VALUE         

In [4]: dis.dis(lambda: Foo().bar.add(1,2))
  1           0 LOAD_GLOBAL              0 (Foo) 
              3 CALL_FUNCTION            0 
              6 LOAD_ATTR                1 (bar) 
              9 LOAD_ATTR                2 (add) 
             12 LOAD_CONST               1 (1) 
             15 LOAD_CONST               2 (2) 
             18 CALL_FUNCTION            2 
             21 RETURN_VALUE 

As you can see from that, both styles generate the same bytecode aside from creating the object in the second case.
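The disassembly above uses `Foo()` rather than `self`, so it does not quite match what happens inside a method. As a rough sketch (assuming CPython 3, where `dis.get_instructions` is available, and with `A` defined at module level), the two spellings can be compared inside real methods. The class `A` and the helper `opnames` are illustrative names only, and exact opcode names vary between Python versions.

```python
import dis


class A:
    var = {}

    def f1(self):
        x = A.var      # the class name A is looked up as a global

    def f2(self):
        x = self.var   # self is a local variable of the method


def opnames(func):
    """Return the opcode names of func's bytecode, in order."""
    return [ins.opname for ins in dis.get_instructions(func)]


f1_ops = opnames(A.f1)
f2_ops = opnames(A.f2)
print(f1_ops)
print(f2_ops)
```

On CPython the first list should contain a `LOAD_GLOBAL` for `A`, while the second loads `self` with a `LOAD_FAST`-style instruction. Both then use `LOAD_ATTR`, and the cost of that step depends on where the attribute is found.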


The other facet to this is that it doesn't matter. Use whatever expresses your objective in the most precise way. Only optimize if the speed matters.

I'd recommend just going with ClassName.dict_value[key] = value

Daenyth
  • 1
    If you are really concerned with speed then obviously you do not put this inside a class at all, but have a module global _classname_dict_value, there is no added benefit except that someone can access the dict_value outside of your module scope. – Antti Haapala -- Слава Україні May 22 '12 at 14:14
  • 1
    @senderle: The bytecode is only different insofar as it's instantiating an object. – Daenyth May 22 '12 at 14:16
  • 1
    OK, Thanks for addressing that. But I still think it's important to note that identical bytecode doesn't necessarily result in identical performance. In this case, LOAD_ATTR is one bytecode instruction, but it can be faster or slower depending on circumstances -- in rather the way the same set of cpu instructions may be faster or slower depending on whether there is a cache miss. – senderle May 22 '12 at 15:08
0

Instance variables vs. class variables in Python

See also the same question linked above, where I posted a broader test based on @Edward Loper's test. It also covers module variables, which are even faster than class variables.
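This is not the test from the linked question, only a small sketch of how such a three-way comparison could be set up. The names `module_var`, `via_class`, `via_self`, `via_module` and the iteration count `N` are assumptions chosen for illustration. Timings depend heavily on the machine and the Python version, so no results are claimed here.

```python
import timeit

module_var = {}  # module-level variable


class A:
    var = {}  # class variable

    def via_class(self):
        x = A.var

    def via_self(self):
        x = self.var

    def via_module(self):
        x = module_var


a = A()
N = 200000  # kept small so the sketch finishes quickly; raise it for stabler numbers

# Time each access style; the globals argument (Python 3.5+) makes `a` visible to the timed statement.
results = {
    name: timeit.timeit('a.%s()' % name, globals={'a': a}, number=N)
    for name in ('via_class', 'via_self', 'via_module')
}

for name, seconds in sorted(results.items()):
    print('%-10s %.3f s' % (name, seconds))
```

Run it several times before drawing conclusions; the differences are small compared with the cost of the method call itself.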

robertcollier4