8

I've just spent half a day tracing down a bug to a dict that I forgot to sort when iterating over it. Even though that part of code is tested, the tests did not pick it up because that dict had a repeatable ordering during the test. Only when I shuffled the dict did the tests fail! (I used an intermediate random.shuffle'd list and built an OrderedDict)

This kinda scares me, because there might be similar bugs all over the place!

Is there any way to globally force all dicts to be unordered during testing?

Update: I think I at least figured out what caused the bug. As is described here, dicts with int keys are usually sorted, but might not always be. My hypothesis: In my tests, the ints are always small (order 10), and thus always in order. In the real run, however, the ints are much bigger (order 10^13), and thus not always ordered. I was able to reproduce this behaviour in an interactive session: list(foo.keys()) == sorted(foo.keys()) was always True for small keys, but not for every dict with very large keys.

flotzilla
  • 1,181
  • 1
  • 13
  • 23
  • 6
    The dicts are *not* sorted. But when constructing, usually the order is *deterministic* (that's something different). You can use Python with the `-R` flag. – Willem Van Onsem Aug 08 '17 at 15:58
  • With the -R flag, the key order would remain constant for a given Python process, right? – RagingRoosevelt Aug 08 '17 at 16:17
  • 2
    @RagingRoosevelt: yes. But each tests, you would use a different order. So that means that after a few tests, the errors will probably start to occur. Arbitrary testing is for instance one of the things Haskell uses to `QuickCheck`: you generate random lists/integers/strings/... and verify that the conditions hold. – Willem Van Onsem Aug 08 '17 at 16:24
  • @wim: I would not consider this a real answer. – Willem Van Onsem Aug 08 '17 at 16:38
  • Indeed, it is relevant only for Python2 and the question is now tagged Python3. Perhaps you could delete the comments, then. – wim Aug 08 '17 at 16:48
  • @flotzilla: I'm glad you located your issue. Your comment in the update is what I tried to explain to you. Never assume any ordering in dicts or sets. If you need it ordered, you need to create an OrderedDict object and initialize it with the keys and corresponding values per the order in which you want your keys. – Alexander Aug 08 '17 at 18:04
  • I think it's the inverse: I never assumed any ordering, but rather the absence thereof! Thus I never bothered to test for explicit random order. Though that would not even have helped with the int issue; rather, I should use realistic ints in the test instead of 1, 2, 3... Thanks for helping in any case! – flotzilla Aug 08 '17 at 18:11

1 Answers1

1

As of 3.6, dicts maintain key insertion order. If you would like them to behave a specific way, you could create a class that inherits from a dict and give it the behavior you want (eg ensuring that it shuffles the key list before returning it).

Regardless of the version of python you use, be sure to check out the implementation details if you want to try to rely on a specific behavior of the dict implementation. Better yet, for portability, just code a solution that ensures they behave as you expect, either as I mentioned above, or by using shuffle over a list of the keys, or something else.

RagingRoosevelt
  • 2,046
  • 19
  • 34
  • 1
    Wow, didn't know that. Thanks for informing. – JoshuaRLi Aug 08 '17 at 16:03
  • Ah, right, I've read that at some point but I forgot about it since. Thanks for the reminder. – flotzilla Aug 08 '17 at 16:05
  • 2
    Per the notes in the link: "The order-preserving aspect of this new implementation is considered an implementation detail and should not be relied upon" – Alexander Aug 08 '17 at 16:15
  • @Alexander, edited to make that more clear in my answer, thanks! – RagingRoosevelt Aug 08 '17 at 16:20
  • Even shuffling keys can randomly put them in order. A shuffle is not sufficient. – Alexander Aug 08 '17 at 16:21
  • Indeed: if all the objects have a *different* hashcode. Then Python versions below 3.6 (or the ones that differ on this implementation detail) can - will be sorted the same, regardless of the order of input. – Willem Van Onsem Aug 08 '17 at 16:27
  • Turns out I'm only running 3.4, so this was not the issue. However, my dict keys are integers which turn out to be "usually sorted", which I think caused the bug (see update to question). – flotzilla Aug 08 '17 at 17:56