
I am using a dictionary of dataframes to do some analysis on NFL teams. I need to loop through the dictionaries (backwards, ordered by time of insertion) for the analysis I plan to do. Each NFL team gets its own dictionary.

[Screenshot of code and output]

My functions iterate through the dictionary with code similar to the line displayed at the top. Each key is a tuple, and the second entry in the tuple denotes the week (of the NFL season) the game was played. I initially inserted week 1's key and value, then week 2's key and value, then week 3's key and value. Seeing the output, this works as planned and means my functions should work as they are meant to. No problems in practice. However, if you view the dictionary itself, the keys are out of order (see the second output).

So what exactly determines the order of the keys when you view the dictionary? The Buccaneers dictionary goes 2 -> 1 -> 3. But this is not the case for each team's dictionary; the order seems completely random. What determines this order? I am curious, since I definitely inserted them in 1 -> 2 -> 3 order for every team. I am using Python 3.6.

PumpMan
    This sounds like you're using an outdated IPython kernel that sorts the keys when displaying dicts. Update your IPython. – user2357112 Oct 08 '20 at 00:49

1 Answer


See this question for details. To summarize: dictionaries have preserved insertion order since CPython 3.6, but that was only an implementation detail until the Python 3.7 language specification made it a guarantee. The doc states:

Changed in version 3.7: Dictionary order is guaranteed to be insertion order.

Hence the answer to your question is:

  • if you mean CPython specifically, the dictionary order is the insertion order (though in 3.6 that is not guaranteed by the specs, and one can imagine, in theory, a patch to CPython 3.6 that breaks this behaviour)
  • if you mean any implementation (CPython, Jython, PyPy...), the implementation determines the dictionary order: there is no guarantee on the order (unless specified by the implementation).
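To make the distinction concrete, here is a minimal sketch, assuming hypothetical (opponent, week) keys and plain strings standing in for the DataFrames. It checks that iteration follows insertion order, shows one way to loop backwards that also works on 3.6, and shows that a display layer can sort the keys without changing the dictionary. The standard `pprint` module is used here because it sorts dict keys by default; an older IPython display may behave similarly, as the comment on the question suggests, but that is not verified here.

```python
import pprint

# Hypothetical keys of the form (opponent, week); the values stand in for DataFrames.
games = {}
games[("Saints", 1)] = "week 1 data"
games[("Panthers", 2)] = "week 2 data"
games[("Broncos", 3)] = "week 3 data"

# Iterating over the dict follows insertion order (CPython 3.6+, guaranteed from 3.7).
weeks_in_order = [week for (_, week) in games]

# Backwards iteration. Going through list() keeps this working on 3.6 and 3.7,
# where reversed() does not accept a plain dict directly.
weeks_reversed = [week for (_, week) in reversed(list(games))]

# pprint sorts the keys for display only; the dict itself is untouched.
shown = pprint.pformat(games)

print(weeks_in_order)   # [1, 2, 3]
print(weeks_reversed)   # [3, 2, 1]
print(shown)            # keys appear sorted: Broncos, Panthers, Saints
```

If the loop gives the expected order but the displayed dictionary does not, the display is the likely culprit rather than the dictionary.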

You might ask why there are implementations of dictionaries that are not ordered by insertion order. I suggest you read up on the hash table data structure. Basically, the values are put in an array, at a position that depends on the hash of the key. The hash function maps a key to the index of an array cell. This is why the lookup is so fast: take the key, compute the hash, read the value in the cell (I ignore the collision resolution details), instead of scanning a whole list of (key, value) pairs for instance.

There is no guarantee that the order of the hashed keys is the same as the order of insertion of the keys (or the order of the keys themselves). If you list the keys by scanning the array, the order of the keys appears to be random.
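The effect can be reproduced with a toy table. This is only an illustrative sketch, not how CPython's dict is actually implemented: `slot_order` is a hypothetical helper that places each key at `hash(key) % size`, uses linear probing on collisions, and assumes fewer keys than slots. It relies on CPython hashing small integers to themselves.

```python
def slot_order(keys, size=8):
    """Place keys in a fixed-size table by hash and return them in slot order.

    Toy model only: assumes len(keys) < size so probing always finds a free cell.
    """
    table = [None] * size
    for key in keys:
        index = hash(key) % size
        # Linear probing: move to the next cell while the current one is taken.
        while table[index] is not None:
            index = (index + 1) % size
        table[index] = key
    # Scanning the array from left to right yields the "hash order".
    return [key for key in table if key is not None]


inserted = [10, 3, 17, 4]
scanned = slot_order(inserted)
print(scanned)  # [17, 10, 3, 4] on CPython
```

The keys come back in the order of their array cells (17 lands in cell 1, 10 in cell 2, and so on), which has nothing to do with the order in which they were inserted.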


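Remark: you can use the collections.OrderedDict class to make the ordering explicit, but note that it preserves the insertion order, not the sorted order of the keys (e.g. 'Opponent' < 'Reference'). To iterate in key order, sort the keys yourself, for instance with sorted(d).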
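As a quick check of how `collections.OrderedDict` behaves, the sketch below uses made-up keys. It shows that the class keeps insertion order, supports `reversed()` directly (also on Python 3.6), and that key order has to be requested separately with `sorted()`.

```python
from collections import OrderedDict

# Made-up keys, inserted in a deliberately non-alphabetical order.
stats = OrderedDict()
stats["Reference"] = 1
stats["Opponent"] = 2
stats["Score"] = 3

insertion_order = list(stats)       # order in which the keys were added
backwards = list(reversed(stats))   # OrderedDict supports reversed() directly
key_order = sorted(stats)           # alphabetical order, computed on demand

print(insertion_order)  # ['Reference', 'Opponent', 'Score']
print(backwards)        # ['Score', 'Opponent', 'Reference']
print(key_order)        # ['Opponent', 'Reference', 'Score']
```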

jferard