736

Dictionaries are insertion ordered as of Python 3.6. It is described as a CPython implementation detail rather than a language feature. The documentation states:

dict() now uses a “compact” representation pioneered by PyPy. The memory usage of the new dict() is between 20% and 25% smaller compared to Python 3.5. PEP 468 (Preserving the order of **kwargs in a function.) is implemented by this. The order-preserving aspect of this new implementation is considered an implementation detail and should not be relied upon (this may change in the future, but it is desired to have this new dict implementation in the language for a few releases before changing the language spec to mandate order-preserving semantics for all current and future Python implementations; this also helps preserve backwards-compatibility with older versions of the language where random iteration order is still in effect, e.g. Python 3.5). (Contributed by INADA Naoki in issue 27350. Idea originally suggested by Raymond Hettinger.)

How does the new dictionary implementation perform better than the older one while preserving element order?


Update December 2017: dicts retaining insertion order is guaranteed for Python 3.7

Mateen Ulhaq
  • 24,552
  • 19
  • 101
  • 135
Chris_Rands
  • 38,994
  • 14
  • 83
  • 119
  • 5
    See this thread on Python-Dev mailing-list : https://mail.python.org/pipermail/python-dev/2016-September/146327.html if you haven't seen it ; it's basically a discussion around these subjects. – mgc Oct 11 '16 at 15:11
  • Info [here](https://dl.dropboxusercontent.com/u/3967849/sfmu2/_build/html/index.html) from Raymon Hettinger including the original code recipe for the new dict. Interestingly he says: "At the time it was presented, the mood was opposed to dicts being ordered, so this [original] recipe intentionally fills in deleted values with the last entry in the list." – Chris_Rands Dec 09 '16 at 14:21
  • 2
    If kwargs are now supposed to be ordered (which is nice idea) and kwargs are dict, not OrderedDict, then I guess one could assume that dict keys will stay ordered in the future version of Python, despite the documentation says otherwise. – Dmitriy Sintsov Jan 12 '17 at 12:32
  • 6
    @DmitriySintsov No, don't make that assumption. This was an issue brought up during the writing of the PEP that defines order preserving feature of `**kwargs` and as such the wording used is diplomatic: *`**kwargs` in a function signature is now guaranteed to be an insertion-order-preserving **mapping***. They've used the term *mapping* in order to not force any other implementations to make the dict ordered (and use an `OrderedDict` internally) and as a way to signal that this isn't supposed to depend on the fact that the `dict` is not ordered. – Dimitris Fasarakis Hilliard Feb 04 '17 at 17:18
  • 13
    A good [video explanation](https://www.youtube.com/watch?v=p33CVV29OG8) from Raymond Hettinger – Alex Jul 22 '17 at 16:38
  • 1
    @wazoox, the ordering and complexity of the hashmap hasn't changed. The change makes the hashmap smaller by wasting less space, and the saved space is (usually?) more than the auxiliary array takes. Faster, smaller, ordered - you get to pick all 3. – John La Rooy Aug 08 '17 at 20:28
  • Any way to have `OrderedDict`s automatically converted to ordinary `dict`s in Python 3.7, or must one switch manually by testing what version of Python is running? – martineau Jun 03 '19 at 09:36
  • 1
    @martineau Might be worth a separate question, but not that I know of. The performance benefit of switching I guess would be mild. Plus you might want an `OrderedDict` still, even in Python 3.7 https://stackoverflow.com/questions/50872498/will-ordereddict-become-redundant-in-python-3-7/50872567#50872567 – Chris_Rands Jun 03 '19 at 10:04
  • Chris: Good points in linked answer. I think there's a large number of `OrderedDict` use cases that don't use those "advanced" features—which is why I asked—but it's easy enough to test what version of Python is being used and pick which one you want when they can be used interchangeably. – martineau Jun 03 '19 at 10:12
  • The "it's considered an implementation detail" note in the docs was put there to help people avoid bugs in code written for 3.6 but was later run on 3.5. If people waited until 3.7 to rely on ordering, they wouldn't get a hard to find bug in the previous version 3.6. That was the hope and mostly it worked out and the transition was smooth. Ordering was present in 3.6 and various PEPs for 3.6 (like **kwargs guaranteed ordering) made it impossible to be otherwise. Also once 3.6 released, it was guaranteed not to change. – Raymond Hettinger Jun 22 '22 at 12:43
  • @RaymondHettinger: The rules in 3.6 didn't actually force *always* ordered, they just implied ordering for a freshly constructed `dict` with no deletions and reinsertions. The guarantee wasn't made initially because there was (initially) some debate over whether Python might want to reuse deleted slots in the flat array; you could avoid recompacting the compact storage, at the expense of ruining insertion-ordering (if `'a'`, `'b'`, and `'c'` are inserted, `'b'` is deleted, then `'d'` inserted, reusing `'b'`'s storage would break ordering but avoid recompaction work later). – ShadowRanger Oct 15 '22 at 11:25
  • The guarantees made in 3.6 boiled down to an implicit guarantee of "If you build a `dict` purely by inserting new keys or reassigning existing keys, never deleting keys, the `dict` will be insertion-ordered", with 3.7 making the further decision that it was acceptable to waste a little memory and add some (intermittent, amortized) recompaction work for the case of intermingled deletions and insertions in exchange for preserving that guarantee across *all* `dict`s. – ShadowRanger Oct 15 '22 at 11:28
  • Nope. Your revisionist history is incorrect. Recompaction was present from the very first checkin for 3.6. At the sprint, Victor and I reviewed the patch and Guido was insistent that ordering be preserved in the face of deletions. PEP 520 made this a requirement. We had tests for order preservation including deletion/reinsertion that were checked in at the same time. Nothing changed in 3.7. – Raymond Hettinger Oct 15 '22 at 17:33

6 Answers6

814

Are dictionaries ordered in Python 3.6+?

They are insertion ordered[1].

As of Python 3.6, for the CPython implementation of Python, dictionaries remember the order of items inserted. This is considered an implementation detail in Python 3.6; you need to use OrderedDict if you want insertion ordering that's guaranteed across other implementations of Python (and other ordered behavior[1]).

As of Python 3.7, this is a guaranteed language feature, not merely an implementation detail. From a python-dev message by GvR:

Make it so. "Dict keeps insertion order" is the ruling. Thanks!

This simply means that you can depend on it. Other implementations of Python must also offer an insertion ordered dictionary if they wish to be a conforming implementation of Python 3.7.


How does the Python 3.6 dictionary implementation perform better[2] than the older one while preserving element order?

Essentially, by keeping two arrays.

  • The first array, dk_entries, holds the entries (of type PyDictKeyEntry) for the dictionary in the order that they were inserted. Preserving order is achieved by this being an append only array where new items are always inserted at the end (insertion order).

  • The second, dk_indices, holds the indices for the dk_entries array (that is, values that indicate the position of the corresponding entry in dk_entries). This array acts as the hash table. When a key is hashed it leads to one of the indices stored in dk_indices and the corresponding entry is fetched by indexing dk_entries. Since only indices are kept, the type of this array depends on the overall size of the dictionary (ranging from type int8_t(1 byte) to int32_t/int64_t (4/8 bytes) on 32/64 bit builds)

In the previous implementation, a sparse array of type PyDictKeyEntry and size dk_size had to be allocated; unfortunately, it also resulted in a lot of empty space since that array was not allowed to be more than 2/3 * dk_size full for performance reasons. (and the empty space still had PyDictKeyEntry size!).

This is not the case now since only the required entries are stored (those that have been inserted) and a sparse array of type intX_t (X depending on dict size) 2/3 * dk_sizes full is kept. The empty space changed from type PyDictKeyEntry to intX_t.

So, obviously, creating a sparse array of type PyDictKeyEntry is much more memory demanding than a sparse array for storing ints.

You can see the full conversation on Python-Dev regarding this feature if interested, it is a good read.


In the original proposal made by Raymond Hettinger, a visualization of the data structures used can be seen which captures the gist of the idea.

For example, the dictionary:

d = {'timmy': 'red', 'barry': 'green', 'guido': 'blue'}

is currently stored as [keyhash, key, value]:

entries = [['--', '--', '--'],
           [-8522787127447073495, 'barry', 'green'],
           ['--', '--', '--'],
           ['--', '--', '--'],
           ['--', '--', '--'],
           [-9092791511155847987, 'timmy', 'red'],
           ['--', '--', '--'],
           [-6480567542315338377, 'guido', 'blue']]

Instead, the data should be organized as follows:

indices =  [None, 1, None, None, None, 0, None, 2]
entries =  [[-9092791511155847987, 'timmy', 'red'],
            [-8522787127447073495, 'barry', 'green'],
            [-6480567542315338377, 'guido', 'blue']]

As you can visually now see, in the original proposal, a lot of space is essentially empty to reduce collisions and make look-ups faster. With the new approach, you reduce the memory required by moving the sparseness where it's really required, in the indices.


[1]: I say "insertion ordered" and not "ordered" since, with the existence of OrderedDict, "ordered" suggests further behavior that the `dict` object *doesn't provide*. OrderedDicts are reversible, provide order sensitive methods and, mainly, provide an order-sensive equality tests (`==`, `!=`). `dict`s currently don't offer any of those behaviors/methods.
[2]: The new dictionary implementations performs better **memory wise** by being designed more compactly; that's the main benefit here. Speed wise, the difference isn't so drastic, there's places where the new dict might introduce slight regressions (key-lookups, for example) while in others (iteration and resizing come to mind) a performance boost should be present. Overall, the performance of the dictionary, especially in real-life situations, improves due to the compactness introduced.
Mateen Ulhaq
  • 24,552
  • 19
  • 101
  • 135
Dimitris Fasarakis Hilliard
  • 150,925
  • 31
  • 268
  • 253
  • 28
    So, what happens when an item is removed? is the `entries` list resized? or is a blank space kept? or is it compressed from time to time? – njzk2 Oct 11 '16 at 19:19
  • 32
    @njzk2 When an item is removed, the corresponding index is replaced by [`DKIX_DUMMY`](https://github.com/python/cpython/blob/master/Objects/dict-common.h#L19) with a value of `-2` and the entry in the `entry` array [replaced by `NULL`](https://github.com/python/cpython/blob/master/Objects/dictobject.c#L1823), when inserting is performed the new values are appended to the entries array, Haven't been able to discern yet, but pretty sure when the indices fills up beyond the `2/3` threshold resizing is performed. This can lead to shrinking instead of growing if many `DUMMY` entries exist. – Dimitris Fasarakis Hilliard Oct 11 '16 at 20:03
  • Have you noticed any difference speed wise with the new dict implementation? – Chris_Rands Mar 13 '17 at 22:29
  • 3
    @Chris_Rands Nope, the only actual regression I've seen is on the tracker in a [message by Victor](http://bugs.python.org/msg275587). Other than that microbenchmark, I've seen no other issue/message indicating a serious speed difference in real-life work loads. There's places where the new dict might introduce slight regressions (key-lookups, for example) while in others (iteration and resizing come to mind) a performance boost would be present. – Dimitris Fasarakis Hilliard Mar 14 '17 at 13:26
  • 4
    *Correction on the resizing part*: Dictionaries don't resize when you delete items, they re-calculate when you re-insert. So, if a dict is created with `d = {i:i for i in range(100)}` and you `.pop` all items w/o inserting, the size won't change. When you add to it again, `d[1] = 1`, the appropriate size is calculated and the dict resizes. – Dimitris Fasarakis Hilliard Aug 02 '17 at 19:47
  • @JimFasarakisHilliard what are the values in the `indices` list? how 0, 1, 2 is translated to the actual object? is it just for clarity, or that is the actual value inside that list? I thought it would hold the `hash` value of the key – Chen A. Sep 03 '17 at 18:13
  • Any thoughts on what will happen to `OrderedDict` in the future, I guess it will be maintained for backward compatibility? Currently `OrderedDict` supports `reversed()` iteration and the `OrderedDict.move_to_end()` method, but maybe these will be added to plain `dict` too? – Chris_Rands Apr 09 '18 at 20:55
  • 9
    @Chris_Rands I'm pretty sure it is staying. The thing is, and the reason why I changed my answer to remove blanket statements about '`dict` being ordered', `dict`s aren't ordered in the sense `OrderedDict`s are. The notable issue is equality. `dict`s have order insensitive `==`, `OrderedDict`s have order sensitive ones. Dumping `OrderedDict`s and changing `dicts` to now have order sensitive comparisons could lead to a lot of breakage in old code. I'm guessing the only thing that might change about `OrderedDict`s is its implementation. – Dimitris Fasarakis Hilliard Apr 10 '18 at 16:57
  • 2
    A related S.O discussion can be found [here](https://stackoverflow.com/questions/49159238/why-does-the-kwargs-mapping-compare-equal-with-a-differently-ordered-ordereddi). – Dimitris Fasarakis Hilliard Apr 10 '18 at 16:59
  • 3
    @JimFasarakisHilliard Thank you for your detail answer. I translated it to Korean and distributed to Facebook groups Python Korea. https://blog.sinwoobang.me/post/176050610602/pythondictorder. Many of Pythonista in Korea got help from your post. Thank you again. – sinwoobang Aug 29 '18 at 06:01
  • @Chris_Rands: FYI, `reversed` support [is coming in Python 3.8](https://bugs.python.org/issue33462). I don't expect to see `move_to_end` though; the compact ordered `dict` design doesn't support that as well (it would have to leave a dummy entry behind, and find and update the associated index entry each time) eventually either needlessly expanding the entries array, or forcing a complete rehash to clear dummies. By comparison, `OrderedDict` just tweaks a couple pointers. Any algorithm that relies on `move_to_end` should use `OrderedDict`; adding support to `dict` would encourage bad code. – ShadowRanger Jun 19 '19 at 01:34
  • What do `-9092791511155847987`, `-8522787127447073495` and `-6480567542315338377` mean? – tonix Oct 13 '19 at 09:33
  • Is creation order still not guaranteed in 3.7? e.g. `a = {'one': 1, 'two': 2}`? – naught101 Nov 19 '19 at 00:04
  • @tonix those are the hash values for the dictionary keys. – Dimitris Fasarakis Hilliard Nov 21 '19 at 07:48
  • 2
    @naught101: Creation order is guaranteed in 3.7; in your example, `'one'` will always iterate before `'two'` (where before 3.6 it would change from run to run, thanks to the per-run hash seed used to perturb string hashes to defend against denial-of-service attacks using keys that could otherwise be crafted to have identical hashes). – ShadowRanger Feb 20 '20 at 21:00
  • One more thank you for the detailed answer. And I think it could be a useful comment that insertion ordering means that for example if you build many dicts the same way (same order of key adding without key repetition), they share the same order when later iterated, which is handy for data oriented applications. – matanster Oct 04 '20 at 09:04
  • 1
    How about `set` data structure ? – Toilal Feb 05 '21 at 15:09
  • collections.defaultdict also keep order ? – Pegasus Jan 23 '22 at 15:22
  • How can I use OrderedDict if the package is run in python <3.6 and dict if not then? – MappaM Jun 14 '22 at 07:09
  • @Toilal: `set` is not being changed; the benefits are smaller for larger data structures (`set`s more regularly get big), and it has active negatives for data structures which undergo regular insertion *and* deletion of elements (where most `dict`s used under the hood for instance and class attributes, keyword arguments, globals, etc. are populated once and only rarely deleted from). This change provides smaller benefits and the costs would be paid more often for common `set` use cases, so they won't be doing it. – ShadowRanger Sep 24 '22 at 00:43
  • @Pegasus: Yes, all subclasses of `dict` (which includes `defaultdict` and `Counter` from `collections`) inherit the same behavior. – ShadowRanger Sep 24 '22 at 00:44
  • @MappaM: `if sys.version_info < (3, 6): from collections import OrderedDict as odict`, `else: odict = dict`. – ShadowRanger Sep 24 '22 at 00:45
  • can we access the `entries` array? It might be faster than `list(dict.values())[n]` or `next(itertools.islice(dict.values(), n, None))` – theonlygusti Apr 21 '23 at 03:19
83

Below is answering the original first question:

Should I use dict or OrderedDict in Python 3.6?

I think this sentence from the documentation is actually enough to answer your question

The order-preserving aspect of this new implementation is considered an implementation detail and should not be relied upon

dict is not explicitly meant to be an ordered collection, so if you want to stay consistent and not rely on a side effect of the new implementation you should stick with OrderedDict.

Make your code future proof :)

There's a debate about that here.

EDIT: Python 3.7 will keep this as a feature see

Maresh
  • 4,644
  • 25
  • 30
36

Update: Guido van Rossum announced on the mailing list that as of Python 3.7 dicts in all Python implementations must preserve insertion order.

Dimitris Fasarakis Hilliard
  • 150,925
  • 31
  • 268
  • 253
fjsj
  • 10,995
  • 11
  • 41
  • 57
  • 3
    Now that key ordering is the official standard, what is the purpose of the OrderedDict? Or, is it now redundant? – Jonny Waffles Jun 14 '18 at 18:54
  • 6
    I guess OrderedDict will not be redundant because it has the `move_to_end` method and its equality is order sensitive: https://docs.python.org/3/library/collections.html#collections.OrderedDict. See the note on Jim Fasarakis Hilliard's answer. – fjsj Jun 15 '18 at 20:05
  • @JonnyWaffles see Jim’s answer and this Q&A https://stackoverflow.com/questions/50872498/will-ordereddict-become-redundant-in-python-3-7/ – Chris_Rands Jun 25 '18 at 21:15
  • 5
    If you want your code to run the same on 2.7 and 3.6/3.7+, you need to use OrderedDict – boatcoder Jun 28 '18 at 17:51
  • 7
    Likely there will be an "UnorderedDict" soon for folks that like to hassle their dicts for security reasons ;p – ZF007 Jun 03 '19 at 00:18
23

I wanted to add to the discussion above but don't have the reputation to comment.

Python 3.8 includes the reversed() function on dictionaries (removing another difference from OrderedDict.

Dict and dictviews are now iterable in reversed insertion order using reversed(). (Contributed by Rémi Lapeyre in bpo-33462.) See what's new in python 3.8

I don't see any mention of the equality operator or other features of OrderedDict so they are still not entirely the same.

Gulzar
  • 23,452
  • 27
  • 113
  • 201
rkengler
  • 416
  • 5
  • 6
15

To fully answer this question in 2020, let me quote several statements from official Python docs:

Changed in version 3.7: Dictionary order is guaranteed to be insertion order. This behavior was an implementation detail of CPython from 3.6.

Changed in version 3.7: Dictionary order is guaranteed to be insertion order.

Changed in version 3.8: Dictionaries are now reversible.

Dictionaries and dictionary views are reversible.

A statement regarding OrderedDict vs Dict:

Ordered dictionaries are just like regular dictionaries but have some extra capabilities relating to ordering operations. They have become less important now that the built-in dict class gained the ability to remember insertion order (this new behavior became guaranteed in Python 3.7).

Peng
  • 1,393
  • 13
  • 19
1

Changed in version 3.7: Dictionary order is guaranteed to be insertion order. This behavior was an implementation detail of CPython from 3.6.

storenth
  • 967
  • 11
  • 18