How does the memory allocation work in python dictionaries?

Question

I want to understand how works the memory allocation in python when adding new data to a dictionary. In the code below, I was waiting that every new added data was stacked at the end, however it does not happen.

repetitions = {}
for item in new_deltas:
    list_aux = []
    if float(item[1]) <= 30:
        if float(item[0]) in repetitions:
            aux = repetitions[float(item[0])]
            aux.append(item[1])
            repetitions[float(item[0])] = aux
        else:
            list_aux.append(item[1])
            repetitions[float(item[0])] = list_aux
    print(repetitions)

The results I got are as below. Thus, I would like to understand why the new appended data is not added at the end of the stack, it is added in the middle of it.

My input data is:

new_deltas = [[1.452, 3.292182683944702], [1.449, 4.7438647747039795], [1.494, 6.192960977554321], [1.429, 7.686920166015625]]

The print line outputs:

{1.452: [3.292182683944702]}
{1.452: [3.292182683944702], 1.449: [4.7438647747039795]}
{1.452: [3.292182683944702], 1.494: [6.192960977554321], 1.449: [4.7438647747039795]}
{1.429: [7.686920166015625], 1.452: [3.292182683944702], 1.494: [6.192960977554321], 1.449: [4.7438647747039795]}

What is the order of `repititions.keys()`? - I cannot reproduce, I get what you are expecting - Python 3.6. — wwii, Jan 28 '20 at 23:44
I am using python 3.5.2. The order for repetitions.keys() is: dict_keys([1.429, 1.452, 1.494, 1.449]) — rafaoc, Jan 28 '20 at 23:58
Python dicts have traditionally not preserved insertion order; iterating over them produced an arbitrary ordering. This changed in 3.6, I think. — jasonharper, Jan 29 '20 at 00:00
@jasonharper It changed as an implementation detail of CPython in 3.6; as of 3.7, the language requires all implementations to do so. — chepner, Jan 29 '20 at 00:07

Raymond Hettinger · Answer 1 · 2020-01-29T00:37:20.240

Short answer

Dicts are implemented as hash tables rather than stacks.

Without additional measures that tends to scramble the order of keys

Hash Tables

Prior to Python 3.6, the ordering in a dictionary was randomized by a hash function. Roughly, here's how it worked:

d = {}        # Make a new dictionary
              # Internally 8 buckets are formed:
              #    [ [ ] [ ] [ ] [ ] [ ] [ ] [ ] [ ] ]
d['a'] = 10   # hash('a') % s gives perhaps bucket 5:
              #    [ [ ] [ ] [ ] [ ] [ ] [('a', 10)] [ ] [ ] ]
d['b'] = 20   # hash('b') % s gives perhaps bucket 2:
              #    [ [ ] [ ] [('b', 20)] [ ] [ ] [('a', 10)] [ ] [ ] ]

So, you can see the ordering of this dict would put 'b' before 'a' because the hash function put 'b' in an earlier bucket.

Newer hash tables that remember insertion order

Starting in Python 3.6, there was a stack added as well. See this proof-of-concept for a better idea of how that works.

Accordingly, dicts started to remember insertion order and this behavior became guaranteed in Python 3.7 and later.

Use OrderedDict on older Python implementations

Prior to 3.7, you can use collections.OrderedDict() to get the same effect.

Deeper dive

For those interested in knowing more about how it works, I have a 37 minute video that shows from first principles all of the techniques used to make modern Python dictionaries.

rkersh · Accepted Answer · 2020-01-29T00:12:50.353

Prior to Python 3.6, dictionaries were not ordered (see this stackoverflow thread for more on that). If you are using Python 3.6 or lower (in CPython 3.6 the fact that order is maintained is an implementation detail, but with Python 3.7 it became a language feature), you can use the OrderedDict to get the behavior you want.

For example, you could change the beginning of your code snippet to the following:

from collections import OrderedDict
repetitions = OrderedDict()
...

How does the memory allocation work in python dictionaries?

2 Answers2