2

Both lists have 3 elements but why list size is different? Thanks!

l1.append(1)
l1.append(2)
l1.append(3)
print(l1.__sizeof__()) # the size is 72

l2 = [1,2,3]
print(l2.__sizeof__()) # the size is 64
Bernardo Duarte
  • 4,074
  • 4
  • 19
  • 34
q19260
  • 29
  • 3
  • 4
    probably because in the case of the literal, the underlying buffer is allocated exactly. When a python list is appended to, it will over-allocate sometimes to give amortized constant time append. It really doesn't matter, though. These are implementation details. – juanpa.arrivillaga Jan 10 '20 at 02:53

2 Answers2

5

l1.__sizeof__() was 72, but after appending another element it was still 72.

l1.append(4)
print(l1.__sizeof__()) # prints 72 again

So it seems appending may allocate too much space the first time, and will use up that space with extra appends.

l1 = []
print(l1.__sizeof__())  # prints 40
l1.append(1)
print(l1.__sizeof__())  # prints 72

4 elements - still prints 72
5 elements - prints 104

So, 72 - 40 = 32 (assume 40 bytes overhead for the list object)
32/4 = 8 (the non-overhead space for 4 integers)
8 bytes per element. Seems right for a 64-bit machine.

Applying that to the literally-defined list:

list with 3 elements of size 64.

64 - 40 = 24 # remove the fixed overhead size from the size of the list

24 / 8 = 3 # divide the remaining space by size of an integer

We get exactly 3 elements.

Yep, the literally defined list was allocated the exact amount of space it needed.

Brenda J. Butler
  • 1,475
  • 11
  • 20
4
import time

empty_list_size = [].__sizeof__()  # 40 bytes

lyst = []
while True:
    list_size = (lyst.__sizeof__() - empty_list_size) // 8  # 8 bytes per element
    print(len(lyst), list_size)
    lyst.append("foo")
    time.sleep(.05)  # don't go too fast

Will output

0 0
1 4
2 4
3 4
4 4
5 8
6 8
7 8
8 8
9 16
10 16
11 16
12 16
13 16
14 16
15 16
16 16
17 25

and so on:

yellow line is how many elements are in the list blue line is how many elements the list has memory for

When you append to a list that doesn't have enough memory to store the element, it needs to be re-sized. Resizing means you get a new, bigger chunk of memory and then have to copy the entire list to it, which is expensive (linear time complexity). If you just appended to a list, there's a good chance you are going to append again and we want to avoid resizing the list on every append. So instead of resizing it to be big enough to fit the element(s) you want to append, the list is resized by some percentage of the size it is already. This effectively makes appending a constant time operation because as the list becomes larger, you still have to iterate over the entire list to copy it from time to time, but it becomes rarer the larger the list gets.

Whereas declaring a list literal l = [1, 2, 3] allocates exactly the amount of memory needed to hold a list of that size.

See https://en.wikipedia.org/wiki/Dynamic_array#Geometric_expansion_and_amortized_cost

This is implemented in CPython in listobject.c

Boris Verkhovskiy
  • 14,854
  • 11
  • 100
  • 103
  • Nice graph, how did you do that @Boris? – Brenda J. Butler Jan 10 '20 at 04:09
  • 1
    once you have the x & y from his algorithm, you can create a line from that data and add another line with linear y~x. Python has `matplotlib` library for plots, I prefer `ggplot` of `r` though. Anyway his illustration is excellent, I has just learned this behaviour in python few weeks before. – Atreyagaurav Jan 10 '20 at 04:13
  • 1
    To see a similar graph for `dict`, check out [Dictionary size reduces upon increasing one element](https://stackoverflow.com/a/56382873/674039) – wim Jan 10 '20 at 20:01