I am trying to understand the differences between Python dictionaries in Python 3.6.7 and Python 3.5.2. The way they store the order of key-value pairs seems different.

For example, assume there is a dictionary named di:

    di = {'a':1,'A':1,'b':2, 'B':2, 'c':3, 'C':3}

In Python 3.5.2, when I print di, the output is:

    {'C': 3, 'a': 1, 'A': 1, 'B': 2, 'c': 3, 'b': 2}

However, in Python 3.6.7, it is:

    {'a': 1, 'A': 1, 'b': 2, 'B': 2, 'c': 3, 'C': 3}

What has changed between the two versions? How can I make my code in Python 3.6.7 order the result the way 3.5.2 does?

P.S. I know that a Python dictionary actually has no order. I use the term "order" here only to make my question easier to understand. Thank you.

Long
  • 5
    From Python 3.6+, python dictionaries store key-value pairs in the order they were inserted. – Arkistarvh Kltzuonstev May 13 '19 at 12:45
  • 1
    related: https://stackoverflow.com/questions/39980323/are-dictionaries-ordered-in-python-3-6 – hiro protagonist May 13 '19 at 12:46
  • You recognize that there is no order in dictionaries - the order is implementation dependent, and may change in future versions. Therefore depending on such order is a mistake that is bound to break your code at one point or another – Arthur.V May 13 '19 at 12:47
  • if you want to be able to reproduce this behaviour in python 3.5, you need to use `collections.OrderedDict`. – hiro protagonist May 13 '19 at 12:47
  • 3
    @ArkistarvhKltzuonstev the fact that dict() is ordered in py3.6 is an implementation detail. OrderedDict should be used if ordered behaviour is required. Edit: actually, 3.7 canonised the insertion-order nature of dict(). Still, use OrderedDict if you want the order. – Pod May 13 '19 at 12:49
  • So to be clear, you don't like the order-preservation of 3.6, and would prefer if dicts were scrambled the way they were in 3.5? How "similar" does it need to be? If all you want is "any order other than the one that the literal value is in", then maybe you could scramble it with the `random` module and a little work... – Kevin May 13 '19 at 12:49
  • 1
    Most people don't need to rely on dictionary key order except under specific requirements. If what you need is to export a dictionary to a JSON file with its keys in order, `json.dump` actually has a key-order parameter (`sort_keys`). – mootmoot May 13 '19 at 12:54
  • Given that the order in Python 3.5.2 could change from run to run, what would you consider to be a "similar" order? – chepner May 13 '19 at 13:06

2 Answers

TLDR:

To replicate the hash-based ordering, you must take an explicitly ordered dict and impose your own ordering.

    from collections import OrderedDict

    def cpy35dict(source_dict):
        """Apply hash-ordering to a dictionary"""
        return OrderedDict(  # ensure stable ordering
            sorted(          # order items by key hash
                source_dict.items(),
                key=lambda item: hash(item[0])
            )
        )

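This approximates the hash-based ordering used in CPython up to Python 3.5; the real order also depended on the size of the hash table and on how collisions were resolved, so it is not an exact reproduction. Note that the Python language specification makes no guarantees on order prior to Python 3.7 - you should also use this in Python 3.5 and before if you insist on the ordering.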
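As a rough usage sketch, the helper can be applied to the dictionary from the question. The function is repeated here only so the snippet runs on its own; the variable names `hashed` and `ints` are made up for illustration. Because string hashes are salted per process, the order for string keys can differ from run to run unless `PYTHONHASHSEED` is fixed.

```python
from collections import OrderedDict

def cpy35dict(source_dict):
    """Same helper as above, repeated so this snippet is self-contained."""
    return OrderedDict(
        sorted(source_dict.items(), key=lambda item: hash(item[0]))
    )

di = {'a': 1, 'A': 1, 'b': 2, 'B': 2, 'c': 3, 'C': 3}
hashed = cpy35dict(di)

# Same items as di, ordered by the hash of each key.
# The exact order of these string keys depends on the process's hash seed.
print(list(hashed.items()))

# Small non-negative ints hash to themselves, so this order is stable.
ints = cpy35dict({30: 'x', 10: 'y', 20: 'z'})
print(list(ints))  # [10, 20, 30]
```

The result is an `OrderedDict`, so once built its order no longer changes, whichever Python version runs the code.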


There are basically three types of order for dict-like containers:

  • no order: There is no specific order to items. Every access may return a different order.
  • specific order: There is a well-defined order to items. Every access returns the same order.
  • arbitrary order: The ordering is not defined. It may or may not use a specific order.

As per the Python specification, dict has arbitrary order up to Python 3.6 and insertion order since Python 3.7.

Changed in version 3.7: Dictionary order is guaranteed to be insertion order. This behavior was an implementation detail of CPython from 3.6. [Mapping Types — dict]
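A small sketch of what "insertion order" means in practice, assuming Python 3.7 or later (or CPython 3.6, where it is an implementation detail). The keys and values are arbitrary.

```python
import sys

d = {}
for key in ['b', 'a', 'c']:
    d[key] = key.upper()

print(list(d))  # keys come back in the order they were inserted

# Updating an existing key keeps its position...
d['a'] = 'updated'
# ...but deleting and re-inserting a key moves it to the end.
del d['b']
d['b'] = 'B'

if sys.version_info >= (3, 7):
    # Guaranteed by the language from 3.7 on.
    assert list(d) == ['a', 'c', 'b']
```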

However, arbitrary order does not exclude specific order. It basically means "whatever the implementation feels like". Different implementations use different schemes to implement dict.

  • PyPy uses insertion order since Python 2.7/3.2

  • CPython 3.5 and earlier use an ordering derived from the underlying hash function. Since several types have salted hashes, this means you can get a different order on each invocation.

  • CPython 3.6 uses the ordering by which items are inserted. This is explicitly an implementation detail.

The order-preserving aspect of this new implementation is considered an implementation detail and should not be relied upon [What's new in Python 3.6]
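The effect of salted hashes can be observed directly. The following is a sketch, assuming hash randomization is active (the default since Python 3.3) and that spawning a child interpreter is allowed; `str_hash_with_seed` is a hypothetical helper name, not part of any library.

```python
import os
import subprocess
import sys

def str_hash_with_seed(text, seed):
    """Return hash(text) as computed by a fresh interpreter started
    with the given PYTHONHASHSEED (hypothetical helper for illustration)."""
    env = dict(os.environ, PYTHONHASHSEED=str(seed))
    out = subprocess.check_output(
        [sys.executable, "-c", "import sys; print(hash(sys.argv[1]))", text],
        env=env,
    )
    return int(out.decode().strip())

h1 = str_hash_with_seed("a", 1)
h2 = str_hash_with_seed("a", 2)
h1_again = str_hash_with_seed("a", 1)

print(h1 == h1_again)  # same seed, same hash
print(h1 == h2)        # different seeds give different hashes in practice
```

With a hash-ordered dict, as in CPython 3.5, a different hash for the same key is what lets the iteration order change between runs.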

In other words, code for Python 3.6 and earlier should make no assumptions about ordering of dict. Only code for Python 3.7 and later should make assumptions about ordering of dict.

MisterMiyagi
  • 1
    A note: Sorting on the hash won't actually mimic pre-3.6 behavior. The problem is that the order pre-3.6 is based on the hash modulo the number of buckets (so for example, if the underlying storage has 16 buckets, a hash with a remainder of 0 mod 16 will usually come before one with a remainder of 1 mod 16, even if the 0 hash is `0xFFFFFFF0` and the 1 hash is `0x1`). Beyond that, in the case of collisions (where two hashes mod the bucket count have the same value), the second item's bucket can end up somewhere completely different, so it's not even sorted by the low bits of the hash. – ShadowRanger Jun 19 '19 at 01:40
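The bucket arithmetic from the comment above can be illustrated with its own two example values. This is a simplified sketch that treats the integers directly as hash values and ignores collisions and probing entirely; `n_buckets` is an assumed table size.

```python
# Treat these integers directly as hash values (illustration only).
hashes = [0x1, 0xFFFFFFF0]
n_buckets = 16

# Sorting by the full hash puts the small value first.
by_hash = sorted(hashes)

# Ordering by bucket index (hash modulo table size) puts the large value
# first, because 0xFFFFFFF0 % 16 == 0 while 0x1 % 16 == 1.
by_bucket = sorted(hashes, key=lambda h: h % n_buckets)

print([hex(h) for h in by_hash])    # ['0x1', '0xfffffff0']
print([hex(h) for h in by_bucket])  # ['0xfffffff0', '0x1']
```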

The order of a dictionary is "allowed" to be different for each run of the program, let alone between versions. The fact that it is consistently ordered on each version is an implementation detail* that only CPython knows about. Your program should not rely on this behaviour.

How can I make my code order the result of python 3.6.7 similar to 3.5.2's.

Use an OrderedDict!
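A minimal sketch of that suggestion, assuming the goal is one fixed order that behaves the same on Python 3.5 and 3.6. The pair list simply copies the 3.5.2 output shown in the question; the variable names are made up for illustration.

```python
from collections import OrderedDict

# Build from a list of pairs rather than a dict literal: on Python 3.5 a
# literal would already have lost the written order before OrderedDict saw it.
pairs = [('C', 3), ('a', 1), ('A', 1), ('B', 2), ('c', 3), ('b', 2)]
od = OrderedDict(pairs)

print(list(od.items()))  # always in the order given in `pairs`

# Unlike plain dicts, two OrderedDicts compare equal only if the order matches.
same_items_other_order = OrderedDict(reversed(pairs))
print(od == same_items_other_order)              # False
print(dict(od) == dict(same_items_other_order))  # True
```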

* Actually, as of Python 3.7, the preservation of insertion-order is officially a part of the language spec.

Pod
  • While I agree that OrderedDict is probably useful here, can you elaborate further? What should one do with OrderedDict in order to turn `{'a': 1, 'A': 1, 'b': 2, 'B': 2, 'c': 3, 'C': 3}` into `{'C': 3, 'a': 1, 'A': 1, 'B': 2, 'c': 3, 'b': 2}`? – Kevin May 13 '19 at 12:51
  • 1
    It's not "probably useful" here -- it's required :) If you want an ordered dict, use OrderedDict. This even preserves [order across comparisons](https://stackoverflow.com/a/50872567/57461). And to answer OP's sub-question, and therefore your question in the comments: unless @Long wants that EXACT py3.5 behaviour to happen in py3.6, I imagine the use of OrderedDict will be enough, as it will ensure consistent behaviour across py3.5 and py3.6. – Pod May 13 '19 at 12:56