How does sys.getsizeof work under the hood?

Question

I have been looking at the function sys.getsizeof, and I know that it returns the size in bytes of the object passed to it.

I have some experience with C, where I can work out the size of a value from the sizes of the types involved. I ran some experiments with this function.

Note: I am using Python 3.7.3 on macOS to run the following:

For numbers

>>> sys.getsizeof(0)
24
>>> sys.getsizeof(1)
28
>>> sys.getsizeof(-1)
28
>>> sys.getsizeof(1.0)
24
>>> sys.getsizeof(-1.0)
24

For lists

>>> sys.getsizeof([])
64
>>> sys.getsizeof([1])
72
>>> sys.getsizeof([1.0])
72
>>> sys.getsizeof([0, 1])
80

For strings

>>> sys.getsizeof('d')
50
>>> sys.getsizeof('do')
51

For dictionaries

>>> sys.getsizeof({})
240
>>> sys.getsizeof({'a': 1})
240
>>> sys.getsizeof({'a': 1, 'b': 2})
240
>>> sys.getsizeof({'a': 1, 'b': 2, 'c': 3, 'd': 4})
240

I don't understand why the size of 0 is less than that of other integers. I can see a pattern when adding more elements to a list or a string, but I don't understand why the size of the dictionary is the same no matter how many key-value pairs it has.

Just a guess, but this may be due to: ***"Only the memory consumption directly attributed to the object is accounted for, not the memory consumption of objects it refers to."*** - From https://docs.python.org/3.7/library/sys.html#sys.getsizeof — Pedro Lobito, Apr 02 '19 at 23:20
You may also read: https://stackoverflow.com/a/450034/797495 and the most voted comment : ***" Please add to the disclaimer that it will not hold true for nested objects or nested dicts or dicts in lists etc. – JohnnyM Aug 16 '15 at 9:22"*** — Pedro Lobito, Apr 02 '19 at 23:30
Possible duplicate of [How do I determine the size of an object in Python?](https://stackoverflow.com/questions/449560/how-do-i-determine-the-size-of-an-object-in-python) — Pedro Lobito, Apr 02 '19 at 23:31
@PedroLobito, I don't think it is a duplicate because I am not asking to determine the size of an element, but I am asking how this function behaves. I just want a further explanation for the sake of curiosity. — lmiguelvargasf, Apr 02 '19 at 23:36
@PedroLobito The excerpt you quoted explains why a list or dictionary containing a large string will have the same size as a list or dictionary containing a small string or an integer: the size of the string or integer does not factor into the size of the dict, which only contains an 8-byte (or 4-byte on 32-bit systems) pointer per element. That does not address why a dictionary with two elements has the same size as one with only one element (see my answer for that). — sepp2k, Apr 03 '19 at 00:14

sepp2k · Accepted Answer · 2019-04-03T00:15:07.713

I don't understand why the size of 0 is less than other integers.

I'm assuming that integer objects store the number of ints needed to represent the integer, followed by that many ints. So 0 would be smaller than other numbers because it can be represented with 0 ints. Consequently the size would increase again once you get to numbers that don't fit into a single int.
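A minimal sketch to check this guess, assuming CPython: `sys.int_info` reports how many bits each internal "digit" holds and how many bytes it occupies, so the sizes can be compared on either side of a digit boundary. The variable names are only illustrative, and the exact byte counts depend on the interpreter version and platform (newer CPython versions may report the same size for 0 and 1).

```python
import sys

# CPython stores an int as a fixed header plus an array of "digits".
# sys.int_info describes the digit layout of the running interpreter.
bits = sys.int_info.bits_per_digit      # typically 30 on 64-bit builds
digit_size = sys.int_info.sizeof_digit  # bytes per digit, typically 4

# 1 and 2**bits - 1 fit in one digit; 2**bits needs two;
# 2**(2 * bits) needs three.
samples = [0, 1, 2**bits - 1, 2**bits, 2**(2 * bits)]
sizes = {n: sys.getsizeof(n) for n in samples}

for n, size in sizes.items():
    print(f"{n:>22} -> {size} bytes")

# Growth in bytes when going from one digit to two.
step = sizes[2**bits] - sizes[1]
print("bytes added per extra digit:", step)
```

If the guess is right, the size should stay flat within one digit and jump by `digit_size` bytes at each boundary.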

I don't understand why the size of the dictionary is the same no matter how many key-value pairs it has.

For dicts it's probably because the size of the array in a hash map (which Python's dicts are) is greater than the number of elements. Usually it starts as some default size and then gets doubled whenever a given threshold is reached (like when it's, say, 70% full). Once you get to a certain number of elements, you'll see that the size will increase.
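To see where the size actually changes, here is a small sketch that inserts keys one at a time and records each point at which `sys.getsizeof` reports a new value. The thresholds and byte counts it prints are whatever your interpreter happens to use; nothing here assumes a particular load factor.

```python
import sys

d = {}
last = sys.getsizeof(d)
resizes = []  # (number of keys, new size in bytes) at each size change

for i in range(100):
    d[i] = i
    size = sys.getsizeof(d)
    if size != last:
        resizes.append((len(d), size))
        last = size

for count, size in resizes:
    print(f"{count:>3} keys -> {size} bytes")
```

Only a handful of lines should be printed for 100 insertions, which matches the idea that the table is grown in large steps rather than once per key.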

You will observe similar behavior with lists if you create them by repeatedly appending to them rather than creating a list of a certain size to begin with. That is, if you start with an empty list and then append to it in a loop while printing the size after every append, you'll see that the size only increases sometimes. That's because the underlying array isn't resized on each append. Instead, whenever it's full its capacity grows by an amount proportional to its current size (CPython over-allocates by roughly one eighth rather than doubling), so the gaps between resizes get longer as the list grows, which gives appending amortized O(1) time instead of O(n).
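The append experiment described above can be sketched like this. It only records the lengths at which the reported size changes; how much the list grows at each step depends on the interpreter version, so treat the printed numbers as observations rather than guarantees.

```python
import sys

items = []
last = sys.getsizeof(items)
growth = []  # (length, size in bytes) each time the reported size changes

for i in range(64):
    items.append(i)
    size = sys.getsizeof(items)
    if size != last:
        growth.append((len(items), size))
        last = size

for length, size in growth:
    print(f"len {length:>2} -> {size} bytes")
```

The size changes far fewer than 64 times, because each resize reserves room for several future appends.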

You are totally right about what is going on with dicts. I just ran a test to verify your answer, and yes, great answer. Do you think you can provide some evidence for the first question? I don't quite understand how an int can be represented by no ints ("0 would be smaller than other numbers because it can be represented with 0 ints."). — lmiguelvargasf, Apr 03 '19 at 00:15
@lmiguelvargasf The "evidence" would be that the size does indeed get bigger once you're outside of the int range: `sys.getsizeof(2**32)` is 4 bytes bigger than `sys.getsizeof(42)`, which is the same as `sys.getsizeof(1)`. That strongly suggests that an `int` is added. But no, I don't have actual proof - as I said I was guessing. "why an int can be represented by no ints" If you represent an arbitrary-sized integer as an array of fixed-size integers, then its value would be the sum of the elements of the array times the base to the power of their index. That sum is 0 for the empty array. — sepp2k, Apr 03 '19 at 00:25
thank you! I really appreciate the time you took to answer my question. — lmiguelvargasf, Apr 03 '19 at 00:31
