
Is there a reference for the memory sizes of Python data structures on 32- and 64-bit platforms?

If not, it would be nice to have one on SO. The more exhaustive the better! So how many bytes are used by the following Python structures (depending on the length and the content type when relevant)?

  • int
  • float
  • reference
  • str
  • unicode string
  • tuple
  • list
  • dict
  • set
  • array.array
  • numpy.array
  • deque
  • new-style class objects
  • old-style class objects
  • ... and everything I am forgetting!

(For containers that keep only references to other objects, we obviously do not want to count the sizes of the items themselves, since they might be shared.)

Furthermore, is there a way to get the memory used by an object at runtime (recursively or not)?

LeMiz
  • A lot of helpful explanations may be found here http://stackoverflow.com/questions/1059674/python-memory-model. I would like to see a more systematic overview, though – LeMiz Aug 25 '09 at 23:13
  • For a NumPy array `a`, use `a.nbytes`. – Will May 01 '14 at 20:54
  • If you are interested in a graphical view of this, I made a plot of it once: http://stackoverflow.com/a/30008338/2087463 – tmthydvnprt Mar 13 '16 at 13:36

7 Answers


The recommendation from an earlier question on this was to use sys.getsizeof(), quoting:

>>> import sys
>>> x = 2
>>> sys.getsizeof(x)
14
>>> sys.getsizeof(sys.getsizeof)
32
>>> sys.getsizeof('this')
38
>>> sys.getsizeof('this also')
48

You could take this approach:

>>> import sys
>>> import decimal
>>> 
>>> d = {
...     "int": 0,
...     "float": 0.0,
...     "dict": dict(),
...     "set": set(),
...     "tuple": tuple(),
...     "list": list(),
...     "str": "a",
...     "unicode": u"a",
...     "decimal": decimal.Decimal(0),
...     "object": object(),
... }
>>> for k, v in sorted(d.iteritems()):
...     print k, sys.getsizeof(v)
...
decimal 40
dict 140
float 16
int 12
list 36
object 8
set 116
str 25
tuple 28
unicode 28

Update 2012-09-30:

Python 2.7 (Linux, 32-bit):

decimal 36
dict 136
float 16
int 12
list 32
object 8
set 112
str 22
tuple 24
unicode 32

Python 3.3 (Linux, 32-bit):

decimal 52
dict 144
float 16
int 14
list 32
object 8
set 112
str 26
tuple 24
unicode 26

Update 2016-08-01:

OSX, Python 2.7.10 (default, Oct 23 2015, 19:19:21) [GCC 4.2.1 Compatible Apple LLVM 7.0.0 (clang-700.0.59.5)] on darwin

decimal 80
dict 280
float 24
int 24
list 72
object 16
set 232
str 38
tuple 56
unicode 52
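For reference on current interpreters, here is a hedged Python 3 rewrite of the loop above. This is only a sketch: the exact numbers vary by platform, build, and Python version, and the 2.x str/unicode pair maps to bytes/str in Python 3.

```python
import sys
import decimal

# Python 3 version of the sampling loop above; the sizes printed are
# platform- and version-dependent, so treat specific numbers as
# illustrative only.
samples = {
    "int": 0,
    "float": 0.0,
    "dict": dict(),
    "set": set(),
    "tuple": tuple(),
    "list": list(),
    "str": "a",          # Python 3 str is what 2.x called unicode
    "bytes": b"a",       # closest analogue of a 2.x byte string
    "decimal": decimal.Decimal(0),
    "object": object(),
}

for name, value in sorted(samples.items()):
    print(name, sys.getsizeof(value))
```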
hughdbrown
  • Thanks, and sorry for the dupe for the second question... too bad I am using 2.5 and not 2.6... – LeMiz Aug 25 '09 at 23:16
  • I forgot I had a virtual box with a recent ubuntu on it! That's strange, sys.getsizeof(dict) is 136 for me (python 2.6 running on a kubuntu vm, hosted by OS X, so I am not sure of anything) – LeMiz Aug 25 '09 at 23:39
  • @LeMiz: For me (Python 2.6, Windows XP SP3), sys.getsizeof(dict) -> 436; sys.getsizeof(dict()) -> 140 – John Machin Aug 26 '09 at 10:58
  • LeMiz-Kubuntu:python2.6 Python 2.6.2 (release26-maint, Apr 19 2009, 01:56:41) [GCC 4.3.3] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> import sys >>> sys.getsizeof(dict) 436 >>> sys.getsizeof(dict()) 136 – LeMiz Aug 26 '09 at 15:43
  • shouldn't the values be `0`, `0.0`, `''` and `u''` for consistency? – SilentGhost Aug 26 '09 at 16:38
  • Unfortunately, this does not work for NumPy arrays (they return 44 bytes whatever their size, with version 1.5). – Eric O. Lebigot Feb 05 '10 at 13:34
  • what about making a tuple of constants, and using type(): for item in tuple(0, 0.0, u'', etc.): print type(item), sys.getsizeof(item) – rbp Mar 20 '14 at 15:56
  • decimal? What's that? – Zizouz212 Jun 07 '15 at 03:22
  • I suppose size is in bytes, but why it's not written anywhere, even in http://pythonhosted.org/Pympler – Zhomart Jan 21 '16 at 03:48
  • @Zhomart: Seems pretty clear to me. *sys.getsizeof(object[, default])* "Return the size of an object in bytes. The object can be any type of object. All built-in objects will return correct results, but this does not have to hold true for third-party extensions as it is implementation specific." https://docs.python.org/3/library/sys.html – hughdbrown Jan 22 '16 at 04:15
  • Yeah, I wrote comment to different answer. I wanted to complain about Pympler. Thanks for pointing out @hughdbrown. – Zhomart Jan 26 '16 at 01:08

These answers all collect shallow size information. I suspect that visitors to this question will end up here looking to answer the question, "How big is this complex object in memory?"

There's a great answer here: https://goshippo.com/blog/measure-real-size-any-python-object/

The punchline:

import sys

def get_size(obj, seen=None):
    """Recursively finds size of objects"""
    size = sys.getsizeof(obj)
    if seen is None:
        seen = set()
    obj_id = id(obj)
    if obj_id in seen:
        return 0
    # Important: mark as seen *before* entering recursion to gracefully
    # handle self-referential objects
    seen.add(obj_id)
    if isinstance(obj, dict):
        size += sum([get_size(v, seen) for v in obj.values()])
        size += sum([get_size(k, seen) for k in obj.keys()])
    elif hasattr(obj, '__dict__'):
        size += get_size(obj.__dict__, seen)
    elif hasattr(obj, '__iter__') and not isinstance(obj, (str, bytes, bytearray)):
        size += sum([get_size(i, seen) for i in obj])
    return size

Used like so:

In [1]: get_size(1)
Out[1]: 24

In [2]: get_size([1])
Out[2]: 104

In [3]: get_size([[1]])
Out[3]: 184

If you want to know Python's memory model more deeply, there's a great article here that has a similar "total size" snippet of code as part of a longer explanation: https://code.tutsplus.com/tutorials/understand-how-much-memory-your-python-objects-use--cms-25609
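To make concrete why the recursive walk matters, compare the shallow figure from sys.getsizeof with the size of a container's contents (a small illustrative sketch):

```python
import sys

inner = list(range(1000))  # a list with 1000 int elements
outer = [inner]            # a list holding one reference to it

# Shallow size of the outer list: its header plus one pointer slot.
print(sys.getsizeof(outer))

# The inner list is far larger, but getsizeof(outer) does not see it,
# which is exactly what a recursive helper like get_size corrects.
print(sys.getsizeof(inner))
```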

Kobold

I've been happily using pympler for such tasks. It's compatible with many versions of Python -- the asizeof module in particular goes back to 2.2!

For example, using hughdbrown's example but with from pympler import asizeof at the start and print asizeof.asizeof(v) at the end, I see (system Python 2.5 on MacOSX 10.5):

$ python pymp.py 
set 120
unicode 32
tuple 32
int 16
decimal 152
float 16
list 40
object 0
dict 144
str 32

Clearly there is some approximation here, but I've found it very useful for footprint analysis and tuning.

Alex Martelli
  • Some curiosities: most of your numbers are 4 higher; object is 0; and decimal is about 4 times larger by your estimate. – hughdbrown Aug 26 '09 at 02:26
  • Yep. The "4 higher" actually mostly look like "rounding up to a multiple of 8" which I believe is correct for the way malloc behaves here. No idea why decimal gets so distorted (with pympler on 2.6, too). – Alex Martelli Aug 26 '09 at 02:40
  • Actually, you should use pympler.asizeof.flatsize() to get a similar functionality to sys.getsizeof(). There is also an align= parameter you can use (which is 8 by default as Alex pointed out). – Pankrat Sep 11 '09 at 19:41
  • @AlexMartelli Hi Alex! .. Why is the minimum size of a str in Python 25 bytes? `>>> getsizeof('a')` gives `25` and `>>> getsizeof('ab')` gives `26` – Grijesh Chauhan Jan 17 '13 at 05:13
  • I suppose size is in bytes, but why it's not written anywhere, even in pythonhosted.org/Pympler – Zhomart Jan 26 '16 at 01:08

Try memory_profiler:

Line #    Mem usage  Increment   Line Contents
==============================================
     3                           @profile
     4      5.97 MB    0.00 MB   def my_func():
     5     13.61 MB    7.64 MB       a = [1] * (10 ** 6)
     6    166.20 MB  152.59 MB       b = [2] * (2 * 10 ** 7)
     7     13.61 MB -152.59 MB       del b
     8     13.61 MB    0.00 MB       return a
Tampa
  • Precision seems to be 1/100 MB, or 10.24 KiB. This is fine for macro-analysis, but I doubt that such precision would lead to an accurate comparison of the data structures as asked in the question. – Zoran Pavlovic Nov 30 '14 at 18:49

You can also use the guppy module.

>>> from guppy import hpy; hp=hpy()
>>> hp.heap()
Partition of a set of 25853 objects. Total size = 3320992 bytes.
 Index  Count   %     Size   % Cumulative  % Kind (class / dict of class)
     0  11731  45   929072  28    929072  28 str
     1   5832  23   469760  14   1398832  42 tuple
     2    324   1   277728   8   1676560  50 dict (no owner)
     3     70   0   216976   7   1893536  57 dict of module
     4    199   1   210856   6   2104392  63 dict of type
     5   1627   6   208256   6   2312648  70 types.CodeType
     6   1592   6   191040   6   2503688  75 function
     7    199   1   177008   5   2680696  81 type
     8    124   0   135328   4   2816024  85 dict of class
     9   1045   4    83600   3   2899624  87 __builtin__.wrapper_descriptor
<90 more rows. Type e.g. '_.more' to view.>

And:

>>> hp.iso(1, [1], "1", (1,), {1:1}, None)
Partition of a set of 6 objects. Total size = 560 bytes.
 Index  Count   %     Size   % Cumulative  % Kind (class / dict of class)
     0      1  17      280  50       280  50 dict (no owner)
     1      1  17      136  24       416  74 list
     2      1  17       64  11       480  86 tuple
     3      1  17       40   7       520  93 str
     4      1  17       24   4       544  97 int
     5      1  17       16   3       560 100 types.NoneType
Omid Raha

Every object also exposes a __sizeof__ method (you can spot it in the output of dir(object)), which returns the object's size in bytes:

>>> a = -1
>>> a.__sizeof__()
24
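Note that __sizeof__ and sys.getsizeof are not always identical: for objects tracked by the garbage collector, getsizeof adds the GC header on top of the raw __sizeof__ figure. A quick sketch:

```python
import sys

lst = [1, 2, 3]

# __sizeof__ reports the object's raw size in bytes; sys.getsizeof
# additionally includes the garbage-collector overhead for GC-tracked
# objects such as lists, so it is never smaller.
print(lst.__sizeof__())
print(sys.getsizeof(lst))
```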
hello_god

One can also make use of the tracemalloc module from the Python standard library. It seems to work well for objects whose class is implemented in C (unlike Pympler, for instance).
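A minimal sketch of how tracemalloc can be used for this (the variable `data` is illustrative; note that tracemalloc reports allocations made after start(), rather than the size of one specific pre-existing object):

```python
import tracemalloc

tracemalloc.start()

# Allocate something measurable while tracing is active.
data = [dict(zip("abc", (1, 2, 3))) for _ in range(1000)]

current, peak = tracemalloc.get_traced_memory()
tracemalloc.stop()

print(f"current={current} bytes, peak={peak} bytes")
```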

zahypeti