148

What is more efficient in Python in terms of memory usage and CPU consumption - Dictionary or Object?

Background: I have to load huge amount of data into Python. I created an object that is just a field container. Creating 4M instances and putting them into a dictionary took about 10 minutes and ~6GB of memory. After dictionary is ready, accessing it is a blink of an eye.

Example: To check the performance I wrote two simple programs that do the same - one is using objects, other dictionary:

Object (execution time ~18sec):

class Obj(object):
  def __init__(self, i):
    self.i = i
    self.l = []
all = {}
for i in range(1000000):
  all[i] = Obj(i)

Dictionary (execution time ~12sec):

all = {}
for i in range(1000000):
  o = {}
  o['i'] = i
  o['l'] = []
  all[i] = o

Question: Am I doing something wrong or dictionary is just faster than object? If indeed dictionary performs better, can somebody explain why?

tkokoszka
  • 11,647
  • 10
  • 30
  • 25

8 Answers8

180

Have you tried using __slots__?

From the documentation:

By default, instances of both old and new-style classes have a dictionary for attribute storage. This wastes space for objects having very few instance variables. The space consumption can become acute when creating large numbers of instances.

The default can be overridden by defining __slots__ in a new-style class definition. The __slots__ declaration takes a sequence of instance variables and reserves just enough space in each instance to hold a value for each variable. Space is saved because __dict__ is not created for each instance.

So does this save time as well as memory?

Comparing the three approaches on my computer:

test_slots.py:

class Obj(object):
  __slots__ = ('i', 'l')
  def __init__(self, i):
    self.i = i
    self.l = []
all = {}
for i in range(1000000):
  all[i] = Obj(i)

test_obj.py:

class Obj(object):
  def __init__(self, i):
    self.i = i
    self.l = []
all = {}
for i in range(1000000):
  all[i] = Obj(i)

test_dict.py:

all = {}
for i in range(1000000):
  o = {}
  o['i'] = i
  o['l'] = []
  all[i] = o

test_namedtuple.py (supported in 2.6):

import collections

Obj = collections.namedtuple('Obj', 'i l')

all = {}
for i in range(1000000):
  all[i] = Obj(i, [])

Run benchmark (using CPython 2.5):

$ lshw | grep product | head -n 1
          product: Intel(R) Pentium(R) M processor 1.60GHz
$ python --version
Python 2.5
$ time python test_obj.py && time python test_dict.py && time python test_slots.py 

real    0m27.398s (using 'normal' object)
real    0m16.747s (using __dict__)
real    0m11.777s (using __slots__)

Using CPython 2.6.2, including the named tuple test:

$ python --version
Python 2.6.2
$ time python test_obj.py && time python test_dict.py && time python test_slots.py && time python test_namedtuple.py 

real    0m27.197s (using 'normal' object)
real    0m17.657s (using __dict__)
real    0m12.249s (using __slots__)
real    0m12.262s (using namedtuple)

So yes (not really a surprise), using __slots__ is a performance optimization. Using a named tuple has similar performance to __slots__.

Community
  • 1
  • 1
codeape
  • 97,830
  • 24
  • 159
  • 188
  • 4
    That is great - thanks! I've tried the same on my machine - object with __slots__ is the most efficient approach (I got ~7sec). – tkokoszka Aug 26 '09 at 19:43
  • 6
    There are also named tuples, http://docs.python.org/library/collections.html#collections.namedtuple , a class factory for objects with slots. It definitely neater and maybe even more optimized. – Jochen Ritzel Aug 26 '09 at 20:19
  • 1
    I tested named tuples as well, and updated the answer with the results. – codeape Aug 27 '09 at 08:05
  • 1
    I ran your code a few times and was surprised my results differ - slots=3sec obj=11sec dict=12sec namedtuple=16sec. I'm using CPython 2.6.6 on Win7 64bit – Jonathan Livni Jul 05 '11 at 13:24
  • To emphasize the punchline - **namedtuple** got the *worst* results instead of the *best* – Jonathan Livni Jul 06 '11 at 16:59
  • Strange. I'll try the benchmark on Windows when I get the chance. – codeape Jul 08 '11 at 20:06
  • 1
    Why is using a normal object so much slower than using \_\_dict\_\_ if, under the hood, the normal object is using dictionaries? – Detached Laconian Oct 12 '20 at 15:04
  • Hi. I tried the benchmark proposed here and the best time was obtained using a plain dictionary (Wall time: 1.98 s). I did the test in a jupyter notebook using `%%time` and `%%timeit`. Python version 3.8.5 and OS Windows 8.1. Is this "normal"? I thought that the `__slots__` test (Wall time: 8.8 s) would outperform the other tests. – Pedro Pablo Severin Honorato Feb 25 '21 at 17:44
17

Attribute access in an object uses dictionary access behind the scenes - so by using attribute access you are adding extra overhead. Plus in the object case, you are incurring additional overhead because of e.g. additional memory allocations and code execution (e.g. of the __init__ method).

In your code, if o is an Obj instance, o.attr is equivalent to o.__dict__['attr'] with a small amount of extra overhead.

Vinay Sajip
  • 95,872
  • 14
  • 179
  • 191
  • Did you test this? `o.__dict__["attr"]` is the one with extra overhead, taking an extra bytecode op; obj.attr is faster. (Of course attribute access isn't going to be slower than subscription access--it's a critical, heavily optimized code path.) – Glenn Maynard Aug 26 '09 at 19:29
  • 2
    Obviously if you actually **do** o.__dict__["attr"] it will be slower - I only meant to say that it was equivalent to that, not that it was implemented exactly in that way. I guess it's not clear from my wording. I also mentioned other factors such as memory allocations, constructor call time etc. – Vinay Sajip Aug 26 '09 at 19:34
  • 2
    Is this still the case with recent versions of python3, 11 years later? – matanster May 22 '20 at 19:41
11

Have you considered using a namedtuple? (link for python 2.4/2.5)

It's the new standard way of representing structured data that gives you the performance of a tuple and the convenience of a class.

It's only downside compared with dictionaries is that (like tuples) it doesn't give you the ability to change attributes after creation.

John Fouhy
  • 41,203
  • 19
  • 62
  • 77
9

Here is a copy of @hughdbrown answer for python 3.6.1, I've made the count 5x larger and added some code to test the memory footprint of the python process at the end of each run.

Before the downvoters have at it, Be advised that this method of counting the size of objects is not accurate.

from datetime import datetime
import os
import psutil

process = psutil.Process(os.getpid())


ITER_COUNT = 1000 * 1000 * 5

RESULT=None

def makeL(i):
    # Use this line to negate the effect of the strings on the test 
    # return "Python is smart and will only create one string with this line"

    # Use this if you want to see the difference with 5 million unique strings
    return "This is a sample string %s" % i

def timeit(method):
    def timed(*args, **kw):
        global RESULT
        s = datetime.now()
        RESULT = method(*args, **kw)
        e = datetime.now()

        sizeMb = process.memory_info().rss / 1024 / 1024
        sizeMbStr = "{0:,}".format(round(sizeMb, 2))

        print('Time Taken = %s, \t%s, \tSize = %s' % (e - s, method.__name__, sizeMbStr))

    return timed

class Obj(object):
    def __init__(self, i):
       self.i = i
       self.l = makeL(i)

class SlotObj(object):
    __slots__ = ('i', 'l')
    def __init__(self, i):
       self.i = i
       self.l = makeL(i)

from collections import namedtuple
NT = namedtuple("NT", ["i", 'l'])

@timeit
def profile_dict_of_nt():
    return [NT(i=i, l=makeL(i)) for i in range(ITER_COUNT)]

@timeit
def profile_list_of_nt():
    return dict((i, NT(i=i, l=makeL(i))) for i in range(ITER_COUNT))

@timeit
def profile_dict_of_dict():
    return dict((i, {'i': i, 'l': makeL(i)}) for i in range(ITER_COUNT))

@timeit
def profile_list_of_dict():
    return [{'i': i, 'l': makeL(i)} for i in range(ITER_COUNT)]

@timeit
def profile_dict_of_obj():
    return dict((i, Obj(i)) for i in range(ITER_COUNT))

@timeit
def profile_list_of_obj():
    return [Obj(i) for i in range(ITER_COUNT)]

@timeit
def profile_dict_of_slot():
    return dict((i, SlotObj(i)) for i in range(ITER_COUNT))

@timeit
def profile_list_of_slot():
    return [SlotObj(i) for i in range(ITER_COUNT)]

profile_dict_of_nt()
profile_list_of_nt()
profile_dict_of_dict()
profile_list_of_dict()
profile_dict_of_obj()
profile_list_of_obj()
profile_dict_of_slot()
profile_list_of_slot()

And these are my results

Time Taken = 0:00:07.018720,    provile_dict_of_nt,     Size = 951.83
Time Taken = 0:00:07.716197,    provile_list_of_nt,     Size = 1,084.75
Time Taken = 0:00:03.237139,    profile_dict_of_dict,   Size = 1,926.29
Time Taken = 0:00:02.770469,    profile_list_of_dict,   Size = 1,778.58
Time Taken = 0:00:07.961045,    profile_dict_of_obj,    Size = 1,537.64
Time Taken = 0:00:05.899573,    profile_list_of_obj,    Size = 1,458.05
Time Taken = 0:00:06.567684,    profile_dict_of_slot,   Size = 1,035.65
Time Taken = 0:00:04.925101,    profile_list_of_slot,   Size = 887.49

My conclusion is:

  1. Slots have the best memory footprint and are reasonable on speed.
  2. dicts are the fastest, but use the most memory.
RandomEli
  • 1,527
  • 5
  • 30
  • 53
Jarrod Chesney
  • 709
  • 7
  • 5
  • Man, you should turn this into a question. I ran it on my own computer, too, just to make sure (I didn't have psutil installed, so I took that part out). Anyway, this is baffling to me, and means the original question is not fully answered. All the other answers are like "namedtuple is great" and "use __slots__", and apparently a brand new dict object every time is faster than them? I guess dicts are really well optimised? – Multihunter Oct 12 '17 at 07:01
  • 1
    It seems to be the result of the makeL function returning a string. If you return an empty list, instead, the results roughly match up with hughdbrown's from python2. Except namedtuples are always slower than SlotObj :( – Multihunter Oct 12 '17 at 07:14
  • There could be a small problem: makeL could run with different speeds in each '@timeit' round since strings are cached in python - but maybe I'm wrong. – Barney Szabolcs Aug 26 '19 at 02:12
  • @BarnabasSzabolcs should create a new string every time because it has to substitute in the value "This is a sample string %s" % i – Jarrod Chesney Aug 27 '19 at 03:06
  • Yes, that's true within the loop, but in the second test i starts from 0 again. – Barney Szabolcs Aug 27 '19 at 12:59
4
from datetime import datetime

ITER_COUNT = 1000 * 1000

def timeit(method):
    def timed(*args, **kw):
        s = datetime.now()
        result = method(*args, **kw)
        e = datetime.now()

        print method.__name__, '(%r, %r)' % (args, kw), e - s
        return result
    return timed

class Obj(object):
    def __init__(self, i):
       self.i = i
       self.l = []

class SlotObj(object):
    __slots__ = ('i', 'l')
    def __init__(self, i):
       self.i = i
       self.l = []

@timeit
def profile_dict_of_dict():
    return dict((i, {'i': i, 'l': []}) for i in xrange(ITER_COUNT))

@timeit
def profile_list_of_dict():
    return [{'i': i, 'l': []} for i in xrange(ITER_COUNT)]

@timeit
def profile_dict_of_obj():
    return dict((i, Obj(i)) for i in xrange(ITER_COUNT))

@timeit
def profile_list_of_obj():
    return [Obj(i) for i in xrange(ITER_COUNT)]

@timeit
def profile_dict_of_slotobj():
    return dict((i, SlotObj(i)) for i in xrange(ITER_COUNT))

@timeit
def profile_list_of_slotobj():
    return [SlotObj(i) for i in xrange(ITER_COUNT)]

if __name__ == '__main__':
    profile_dict_of_dict()
    profile_list_of_dict()
    profile_dict_of_obj()
    profile_list_of_obj()
    profile_dict_of_slotobj()
    profile_list_of_slotobj()

Results:

hbrown@hbrown-lpt:~$ python ~/Dropbox/src/StackOverflow/1336791.py 
profile_dict_of_dict ((), {}) 0:00:08.228094
profile_list_of_dict ((), {}) 0:00:06.040870
profile_dict_of_obj ((), {}) 0:00:11.481681
profile_list_of_obj ((), {}) 0:00:10.893125
profile_dict_of_slotobj ((), {}) 0:00:06.381897
profile_list_of_slotobj ((), {}) 0:00:05.860749
hughdbrown
  • 47,733
  • 20
  • 85
  • 108
3

There is no question.
You have data, with no other attributes (no methods, nothing). Hence you have a data container (in this case, a dictionary).

I usually prefer to think in terms of data modeling. If there is some huge performance issue, then I can give up something in the abstraction, but only with very good reasons.
Programming is all about managing complexity, and the maintaining the correct abstraction is very often one of the most useful way to achieve such result.

About the reasons an object is slower, I think your measurement is not correct.
You are performing too little assignments inside the for loop, and therefore what you see there is the different time necessary to instantiate a dict (intrinsic object) and a "custom" object. Although from the language perspective they are the same, they have quite a different implementation.
After that, the assignment time should be almost the same for both, as in the end members are maintained inside a dictionary.

rob
  • 36,896
  • 2
  • 55
  • 65
0

There is yet another way with the help of recordclass library to reduce memory usage if data structure isn't supposed to contain reference cycles.

Let's compare two classes:

class DataItem:
    __slots__ = ('name', 'age', 'address')
    def __init__(self, name, age, address):
        self.name = name
        self.age = age
        self.address = address

and

$ pip install recordclass

>>> from recordclass import make_dataclass
>>> DataItem2 = make_dataclass('DataItem', 'name age address')
>>> inst = DataItem('Mike', 10, 'Cherry Street 15')
>>> inst2 = DataItem2('Mike', 10, 'Cherry Street 15')
>>> print(inst2)
DataItem(name='Mike', age=10, address='Cherry Street 15')
>>> print(sys.getsizeof(inst), sys.getsizeof(inst2))
64 40

It became possible since dataobject-based subclasses doesn't support cyclic garbage collection, which is not needed in such cases.

intellimath
  • 2,396
  • 1
  • 12
  • 13
0

Here are my test runs of the very nice script of @Jarrod-Chesney. For comparison, I also run it against python2 with "range" replaced by "xrange".

By curiosity, I also added similar tests with OrderedDict (ordict) for comparison.

Python 3.6.9:

Time Taken = 0:00:04.971369,    profile_dict_of_nt,     Size = 944.27
Time Taken = 0:00:05.743104,    profile_list_of_nt,     Size = 1,066.93
Time Taken = 0:00:02.524507,    profile_dict_of_dict,   Size = 1,920.35
Time Taken = 0:00:02.123801,    profile_list_of_dict,   Size = 1,760.9
Time Taken = 0:00:05.374294,    profile_dict_of_obj,    Size = 1,532.12
Time Taken = 0:00:04.517245,    profile_list_of_obj,    Size = 1,441.04
Time Taken = 0:00:04.590298,    profile_dict_of_slot,   Size = 1,030.09
Time Taken = 0:00:04.197425,    profile_list_of_slot,   Size = 870.67

Time Taken = 0:00:08.833653,    profile_ordict_of_ordict, Size = 3,045.52
Time Taken = 0:00:11.539006,    profile_list_of_ordict, Size = 2,722.34
Time Taken = 0:00:06.428105,    profile_ordict_of_obj,  Size = 1,799.29
Time Taken = 0:00:05.559248,    profile_ordict_of_slot, Size = 1,257.75

Python 2.7.15+:

Time Taken = 0:00:05.193900,    profile_dict_of_nt,     Size = 906.0
Time Taken = 0:00:05.860978,    profile_list_of_nt,     Size = 1,177.0
Time Taken = 0:00:02.370905,    profile_dict_of_dict,   Size = 2,228.0
Time Taken = 0:00:02.100117,    profile_list_of_dict,   Size = 2,036.0
Time Taken = 0:00:08.353666,    profile_dict_of_obj,    Size = 2,493.0
Time Taken = 0:00:07.441747,    profile_list_of_obj,    Size = 2,337.0
Time Taken = 0:00:06.118018,    profile_dict_of_slot,   Size = 1,117.0
Time Taken = 0:00:04.654888,    profile_list_of_slot,   Size = 964.0

Time Taken = 0:00:59.576874,    profile_ordict_of_ordict, Size = 7,427.0
Time Taken = 0:10:25.679784,    profile_list_of_ordict, Size = 11,305.0
Time Taken = 0:05:47.289230,    profile_ordict_of_obj,  Size = 11,477.0
Time Taken = 0:00:51.485756,    profile_ordict_of_slot, Size = 11,193.0

So, on both major versions, the conclusions of @Jarrod-Chesney are still looking good.

Florent V
  • 191
  • 1
  • 7