
I ran the code below in Python 2.7.13 today and found that the reported size of a list is not 0 even when the list is empty:

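import sys

n = 20  # number of iterations; matches the 20 lines of output below
data = []
for k in range(n):
    a = len(data)            # number of elements in the list
    b = sys.getsizeof(data)  # size of the list object in bytes
    print('Length:{0:3d};Size in bytes:{1:4d}'.format(a, b))
    data.append(None)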

Output on my machine:

Length: 0; Size in bytes : 72
Length: 1; Size in bytes : 104
Length: 2; Size in bytes : 104
Length: 3; Size in bytes : 104
Length: 4; Size in bytes : 104
Length: 5; Size in bytes : 136
Length: 6; Size in bytes : 136
Length: 7; Size in bytes : 136
Length: 8; Size in bytes : 136
Length: 9; Size in bytes : 200
Length: 10; Size in bytes : 200
Length: 11; Size in bytes : 200
Length: 12; Size in bytes : 200
Length: 13; Size in bytes : 200
Length: 14; Size in bytes : 200
Length: 15; Size in bytes : 200
Length: 16; Size in bytes : 200
Length: 17; Size in bytes : 272
Length: 18; Size in bytes : 272
Length: 19; Size in bytes : 272

Why is this happening?

It seems that Python is reserving memory for something. What is that something?

Aditya

2 Answers

Because the size reported by sys.getsizeof covers more than the elements the list contains.

Every object in CPython is represented by a C struct. For a list, that struct holds a header (a reference count and a pointer to the type object, which is where the methods live), the length, a pointer to the element array, and the allocated capacity. All of this is counted when sys.getsizeof is invoked.

You can always take a look at the implementation of list.__sizeof__ in the master branch of the CPython repository on GitHub:

static PyObject *
list___sizeof___impl(PyListObject *self)
{
    Py_ssize_t res;

    res = _PyObject_SIZE(Py_TYPE(self)) + self->allocated * sizeof(void*);
    return PyLong_FromSsize_t(res);
}

(Unrelated Argument Clinic output trimmed.)

The __sizeof__ implementation for 2.x does the same thing.

So the return value res includes the size of the list struct itself, _PyObject_SIZE(Py_TYPE(self)), plus one pointer per allocated slot.
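The formula above can be inverted from Python to estimate how many slots a list currently has allocated. This is a minimal sketch, assuming CPython and assuming that an empty list literal has zero allocated slots; the helper name allocated_slots is made up for illustration.

```python
import struct
import sys

POINTER_SIZE = struct.calcsize("P")  # sizeof(void*) on this build


def allocated_slots(lst):
    """Estimate `allocated` by inverting res = base + allocated * sizeof(void*).

    Hypothetical helper: assumes CPython, where [] has no element array yet.
    """
    base = sys.getsizeof([])  # struct (plus GC header) with no slots
    return (sys.getsizeof(lst) - base) // POINTER_SIZE


data = []
for _ in range(5):
    data.append(None)

# The slot count is at least the length, and usually larger.
print(len(data), allocated_slots(data))
```

Applied to the question's numbers, and assuming 8-byte pointers, (104 - 72) / 8 gives 4 slots after the first append.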

Since everything in Python is an object, this behavior can be observed everywhere, e.g., with the integer 0:

>>> from sys import getsizeof
>>> getsizeof(0)
24

While you wouldn't normally expect this, it makes perfect sense once you realize that everything in Python carries "additional baggage" that enables the behavior we take for granted.
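To see that baggage across several built-in types, a quick sketch like the following prints the size of a handful of "empty" values. The exact numbers depend on the Python version and platform, so none are shown here.

```python
import sys

# A few "empty" or zero-valued objects of different built-in types.
empties = [0, 0.0, "", (), [], {}, set(), None]

# Map each type name to the size reported for its empty value.
sizes = {type(obj).__name__: sys.getsizeof(obj) for obj in empties}

for name, size in sizes.items():
    print("{0:10s} {1:4d}".format(name, size))
```

None of them reports 0 bytes, because each one still needs its object header.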

Dimitris Fasarakis Hilliard

CPython is implemented in C, so it stores each object in a C struct.

Remember that everything is an object, and every object carries at least a reference count and a type pointer (plus an item count, for variable-size objects such as lists), even when it stores nothing.

Below are the PyObject_VAR_HEAD macro and the PyListObject struct:

#define PyObject_VAR_HEAD               \
    PyObject_HEAD                       \
    Py_ssize_t ob_size; /* Number of items in variable part */

typedef struct {
    PyObject_VAR_HEAD
    /* Vector of pointers to list elements.  list[0] is ob_item[0], etc. */
    PyObject **ob_item;

    /* ob_item contains space for 'allocated' elements.  The number
     * currently in use is ob_size.
     * Invariants:
     *     0 <= ob_size <= allocated
     *     len(list) == ob_size
     *     ob_item == NULL implies ob_size == allocated == 0
     * list.sort() temporarily sets allocated to -1 to detect mutations.
     *
     * Items must normally not be NULL, except during construction when
     * the list is not yet visible outside the function that builds it.
     */
    Py_ssize_t allocated;
} PyListObject;
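As a rough cross-check of the struct above, the fields can be counted and compared with what Python reports for an empty list. This is only a sketch: it assumes a typical non-debug CPython build in which every field is one pointer-sized word, and other builds may lay the struct out differently.

```python
import struct
import sys

WORD = struct.calcsize("P")  # pointer size; assumed equal to sizeof(Py_ssize_t)

# Fields of PyListObject as declared above:
# ob_refcnt and ob_type (from PyObject_HEAD), ob_size, ob_item, allocated.
expected = 5 * WORD

struct_only = [].__sizeof__()  # the list struct alone, no element array
with_gc = sys.getsizeof([])    # getsizeof also adds the garbage collector header

print("expected struct size:", expected)
print("list.__sizeof__():   ", struct_only)
print("sys.getsizeof([]):   ", with_gc)
```

The gap between the last two values is the garbage collector overhead that sys.getsizeof adds for GC-managed objects, which is part of why an empty list reports more than the struct alone.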

Remember that sys.getsizeof() reports the underlying memory usage of the object itself, which you rarely need to worry about from Python code. The documentation says:

Return the size of an object in bytes.

Only the memory consumption directly attributed to the object is accounted for, not the memory consumption of objects it refers to.
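A small sketch of what "directly attributed" means in practice: two one-element lists report the same size, no matter how large the object they point to is. The variable names are illustrative only.

```python
import sys

tiny_payload = "x"
big_payload = "x" * 10**6  # roughly a megabyte of character data

small = [tiny_payload]
holder = [big_payload]

# Both lists hold one pointer, so their own sizes match.
print(sys.getsizeof(small), sys.getsizeof(holder))

# The string's memory is counted only when you ask about the string itself.
print(sys.getsizeof(big_payload))
```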

Additionally, as your test shows, some pre-allocation is going on: new memory is not allocated for the list on every call to append().
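One way to observe the pre-allocation is to count how often the reported size changes while appending. A minimal sketch, assuming CPython; count_resizes is a hypothetical helper name, and the exact count varies between versions.

```python
import sys


def count_resizes(n):
    """Append n items and count how many times the list's size changes."""
    data = []
    resizes = 0
    last = sys.getsizeof(data)
    for _ in range(n):
        data.append(None)
        size = sys.getsizeof(data)
        if size != last:  # the element array was reallocated
            resizes += 1
            last = size
    return resizes


print(count_resizes(1000))
```

The count is far smaller than the number of appends. In the question's output, for example, the size changes only at lengths 1, 5, 9 and 17.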

Attie