144

What's the correct way to initialize an ordered dictionary (OD) so that it retains the order of initial data?

from collections import OrderedDict

# Obviously wrong because regular dict loses order
d = OrderedDict({'b':2, 'a':1}) 

# An OD is represented by a list of tuples, so would this work?
d = OrderedDict([('b',2), ('a', 1)])

# What about using a list comprehension, will 'd' preserve the order of 'l'
l = ['b', 'a', 'c', 'aa']
d = OrderedDict([(i,i) for i in l])

Question:

  • Will an OrderedDict preserve the order of a list of tuples, or tuple of tuples or tuple of lists or list of lists etc. passed at the time of initialization (2nd & 3rd example above)?

  • How does one go about verifying if OrderedDict actually maintains an order? Since a dict has an unpredictable order, what if my test vectors luckily have the same initial order as the unpredictable order of a dict? For example, if instead of d = OrderedDict({'b':2, 'a':1}) I write d = OrderedDict({'a':1, 'b':2}), I can wrongly conclude that the order is preserved. In this case, I found out that a dict is ordered alphabetically, but that may not be always true. What's a reliable way to use a counterexample to verify whether a data structure preserves order or not, short of trying test vectors repeatedly until one breaks?

P.S. I'll just leave this here for reference: "The OrderedDict constructor and update() method both accept keyword arguments, but their order is lost because Python’s function call semantics pass-in keyword arguments using a regular unordered dictionary"

P.P.S : Hopefully, in future, OrderedDict will preserve the order of kwargs also (example 1): http://bugs.python.org/issue16991

Machavity
  • 30,841
  • 27
  • 92
  • 100
click
  • 2,093
  • 2
  • 15
  • 21
  • 11
    It's vaguely ironic that initializing an OrderedDict with a (non-empty) dict is the *wrong* thing to do... arguably that should result in a Warning since it probably violates the user's intent. – smci Dec 03 '16 at 02:30
  • 6
    After python3.6, `OrderDict(b=2, a=1)` is also a proper way. See [PEP 468](https://www.python.org/dev/peps/pep-0468/). – IvanaGyro Aug 10 '19 at 08:28

3 Answers3

108

The OrderedDict will preserve any order that it has access to. The only way to pass ordered data to it to initialize is to pass a list (or, more generally, an iterable) of key-value pairs, as in your last two examples. As the documentation you linked to says, the OrderedDict does not have access to any order when you pass in keyword arguments or a dict argument, since any order there is removed before the OrderedDict constructor sees it.

Note that using a list comprehension in your last example doesn't change anything. There's no difference between OrderedDict([(i,i) for i in l]) and OrderedDict([('b', 'b'), ('a', 'a'), ('c', 'c'), ('aa', 'aa')]). The list comprehension is evaluated and creates the list and it is passed in; OrderedDict knows nothing about how it was created.

BrenBarn
  • 242,874
  • 37
  • 412
  • 384
82
# An OD is represented by a list of tuples, so would this work?
d = OrderedDict([('b', 2), ('a', 1)])

Yes, that will work. By definition, a list is always ordered the way it is represented. This goes for list-comprehension too, the list generated is in the same way the data was provided (i.e. source from a list it will be deterministic, sourced from a set or dict not so much).

How does one go about verifying if OrderedDict actually maintains an order. Since a dict has an unpredictable order, what if my test vectors luckily has the same initial order as the unpredictable order of a dict?. For example, if instead of d = OrderedDict({'b':2, 'a':1}) I write d = OrderedDict({'a':1, 'b':2}), I can wrongly conclude that the order is preserved. In this case, I found out that a dict is order alphabetically, but that may not be always true. i.e. what's a reliable way to use a counter example to verify if a data structure preserves order or not short of trying test vectors repeatedly until one breaks.

You keep your source list of 2-tuple around for reference, and use that as your test data for your test cases when you do unit tests. Iterate through them and ensure the order is maintained.

metatoaster
  • 17,419
  • 5
  • 55
  • 66
  • About verifying the order: How do I make sure that my 2-tuple WILL break the order of dict if it's unpredictable? This is a generic question about any data structure, perhaps I should split it from this question. – click Aug 25 '14 at 06:37
  • 1
    You can't deterministically break something that is non-deterministic in nature. – metatoaster Aug 25 '14 at 06:38
  • 1
    So what's the right approach to test such things? You just keep trying indefinitely? The order is unpredictable for programmers, but since it's a hash map, it follows 'some' algorithm & a right test should try to counter that? – click Aug 25 '14 at 06:40
  • 2
    See [`__hash__`](http://docs.python.org/dev/reference/datamodel.html#object.__hash__). Specifically about the `str` type. – metatoaster Aug 25 '14 at 06:43
  • _By definition, a list is always ordered the way it is represented._ This was a key statement for me. I decided to simply use a list of 2-tuples for my basic `OrderedDict` so that I don't have the overhead of converting a list to an `OrderedDict`. I just loop through the elements like a list instead of a dictionary. – Bobort May 18 '17 at 16:17
0

It is also possible (and a little more efficient) to use a generator expression:

d = OrderedDict((i, i) for i in l)

Obviously, the benefit is negligible in this trivial case for l, but if l corresponds to an iterator or was yielding results from a generator, e.g. used to parse and iterate through a large file, then the difference could be very substantial (e.g. avoiding to load the entire contents onto memory). For example:

def mygen(filepath):
    with open(filepath, 'r') as f:
        for line in f:
            yield [int(field) for field line.split()]

d = OrderedDict((i, sum(numbers)) for i, numbers in enumerate(mygen(filepath)))
Ataxias
  • 1,085
  • 13
  • 23