1

I need to quickly hash a dictionary (a counter), and I’m noticing that python seems to order dictionaries with the same keys in the same order, even if they are constructed differently. In fact the dictionaries seem to be able to survive quite a bit of abuse:

>>> D = {'a': 1, 'b': 2, 'c': 3}
>>> list(D)
['b', 'c', 'a']
>>> list(D)
['b', 'c', 'a']
>>> list(D)
['b', 'c', 'a']
>>> list(D)
['b', 'c', 'a']
>>> E = {'a': 1, 'b': 2, 'c': 3}
>>> list(E)
['b', 'c', 'a']
>>> list(E)
['b', 'c', 'a']
>>> list(E)
['b', 'c', 'a']
>>> F = {'a': 1, 'b': 2, 'c': 3}
>>> list(F)
['b', 'c', 'a']
>>> list(F)
['b', 'c', 'a']
>>> list(F)
['b', 'c', 'a']
>>> list(F)
['b', 'c', 'a']
>>> G = {'a': 1, 'b': 2, 'c': 3, 'd': 4}
>>> list(G)
['b', 'c', 'a', 'd']
>>> list(G)
['b', 'c', 'a', 'd']
>>> list(G)
['b', 'c', 'a', 'd']
>>> list(F)
['b', 'c', 'a']
>>> F.pop('a')
1
>>> list(F)
['b', 'c']
>>> F['a'] = 2
>>> list(F)
['b', 'c', 'a']
>>> list(F)
['b', 'c', 'a']
>>> H = {'b': 2, 'a': 1, 'c': 3}
>>> list(H)
['b', 'c', 'a']
>>> H = {'b': 2, 'c': 1, 'a': 3}
>>> list(H)
['b', 'c', 'a']
>>> K = {'b': 2, 'c': 1, 'a': 3, 'd': 4}
>>> list(K)
['b', 'c', 'a', 'd']
>>> K = {'b': 2, 'c': 1, 'd': 3, 'a': 4}
>>> list(K)
['b', 'c', 'a', 'd']

My question is then, if my dictionaries have the same keys and the same values, can I count on the keys being in the same order, at least for the lifetime of that running instance of python? Note that I’m aware python is a bit incomprehensible in how it decides to order a dictionary, but I want to know if given the same inputs, the same instance of python will return the same key ordering each time.

kelvinsong
  • 218
  • 1
  • 7
  • Non python specific note: when hash table grows or shrinks it may re-balance buckets and hence change order if items - you may want to experiment with adding/removing items to see if order suddenly change... – Alexei Levenkov Feb 06 '16 at 04:00
  • 1
    Not reliable. Please dont rely on – dlmeetei Feb 06 '16 at 04:01
  • 1
    Possible duplicate of [Why is the order in Python dictionaries and sets arbitrary?](http://stackoverflow.com/questions/15479928/why-is-the-order-in-python-dictionaries-and-sets-arbitrary) – Eli Korvigo Feb 06 '16 at 04:01
  • For newcomers, [**dictionary order is now guaranteed to be insertion order**](https://docs.python.org/3/library/stdtypes.html#dict.values) as of Python 3.7 – Ari Cooper-Davis Feb 25 '20 at 13:58

4 Answers4

2

Regular python dicts are not ordered. It is never guaranteed that when you get the list of keys that they will be the order you expect them to be.

If you want to preserve order, use an ordered dict.

https://docs.python.org/2/library/collections.html#collections.OrderedDict

BrockLee
  • 931
  • 2
  • 9
  • 24
1

if my dictionaries have the same keys and the same values, can I count on the keys being in the same order

No.

>>> list({'d': 0, 'l': 0})
['d', 'l']

>>> list({'l': 0, 'd': 0})
['l', 'd']
Stefan Pochmann
  • 27,593
  • 8
  • 44
  • 107
  • What do you mean it broke? – Stefan Pochmann Feb 06 '16 at 04:08
  • `list({'a': 0, 'e': 0})` → `['a', 'e']`; `list({'e': 0, 'a': 0})` → `['a', 'e']` for me. "Python 2.7.11 (default, Jan 11 2016, 21:04:40)" – Kijewski Feb 06 '16 at 04:09
  • 1
    I guess we can't even rely on the unreliability :-P. How about the new letters? I checked both Python 2 and 3 this time. Interestingly it's the *only* pair for me that "works" in both versions. – Stefan Pochmann Feb 06 '16 at 04:20
  • [d,l] [l,d] yeah this example works for me, proving that order can be affected when you construct differently; hopes of original poster dashed. – thorr18 Feb 06 '16 at 04:27
1

Python >3.7

Dictionary order is guaranteed to be insertion order.

Python <3.7

In terms of the language definition, no you cannot rely on stable ordering, because it is not promised in the language definition.

Now, it might be that over the short- and medium-term you will find that this ordering is stable, and this makes sense: computers are deterministic, so it's reasonable to expect the same results from one iteration of the experiment to the next. (however, since they are complex systems, this nondeterministic machine might still produce unexpected results, since you don't know the factors that are determinant) However, this reasoning does not extend to the long-term, which is what you should be programming to, because the language implementation is free to choose any means of ordering those keys that it likes, and to change that choice at any time, as long as the implementation is consistent with the language definition. This means that programs depending on some order remaining stable are subject to breakage if run under different implementations, and they are subject to breakage when the implementation is updated. This is not a place you want to be, therefore you should not make any assumptions about the stability of ordering of dictionary keys.

That being said, if you are only concerned about stability just across the lifetime of one running instance of python then this seems like a safe gamble - again, computers are deterministic - but still a gamble. Test carefully against cases rather more complex than the ones you're expecting to encounter, and then decide whether that chopping block looks like a comfortable place to rest your neck.

Ari Cooper-Davis
  • 3,374
  • 3
  • 26
  • 43
Jon Kiparsky
  • 7,499
  • 2
  • 23
  • 38
  • 1
    "to change that choice at any time", correct. Important to note here is that the language does not do this because it hates you, but because an unstable ordering is much faster and much more difficult to attack. – Kijewski Feb 06 '16 at 04:16
  • the reason I am considering this is because it it not the end of the world if the key order changes, the program just takes a performance hit because an expensive calculation must be done twice (since it won’t realize it has already been done). But skipping the sorted() step in `''.join(k + str(v) for k, v in sorted(D.items(), key = lambda k: k[0]))` yields a big savings in performance on the part of the code that fetches the cached values – kelvinsong Feb 06 '16 at 16:11
  • That sounds reasonable, and even interesting, but is there any reason you're not considering the use of OrderedDict to ensure consistency? – Jon Kiparsky Feb 09 '16 at 01:40
1

Given that nobody mentioned this yet, I'll tell you that hash randomization is enabled by default since Python 3.3.

With hash randomization, the result of hash('abc') is different between each Python run. Because hashes are at the base of dictionaries (they are used to determine the location of the item in the internal array used by dict), there are even fewer guarantees about ordering.

$ python3.5
>>> d = {'a': 1, 'b': 2, 'c': 3}
>>> list(d)
['a', 'c', 'b']
>>> list(d)
['a', 'c', 'b']

$ python3.5
# new process, new random seed, new ordering
>>> d = {'a': 1, 'b': 2, 'c': 3}
>>> list(d)
['c', 'a', 'b']
>>> list(d)
['c', 'a', 'b']
Andrea Corbellini
  • 17,339
  • 3
  • 53
  • 69