13

Now that Python 3.7 makes order-preserving dicts officially part of the language spec instead of an implementation detail, I've been trying to wrap my head around how best to use this property. Today I found I needed an order-preserving set, and I think a dictionary might do the trick.

Suppose we have a list of hashable elements. We want a list of the unique entries, kept in order of first appearance. A simple dict comprehension should do the trick:

ls = "Beautiful is better than ugly. Explicit..."
# Keys are the characters of ls; the values (0) are placeholders.
uniques = list({s: 0 for s in ls})
print(uniques)

# ['B', 'e', 'a', 'u', 't', 'i', 'f', 'l', ' ', 's', 'b', 'r', 'h', 'n', 'g', 'y', '.', 'E', 'x', 'p', 'c']

This will preserve the ordering by first appearance and get rid of all duplicates.
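The same idea can also be written with `dict.fromkeys`, which avoids the placeholder value. This is only a minimal sketch, assuming Python 3.7+ where dict insertion order is guaranteed; the helper name `dedupe_keep_order` is made up for illustration.

```python
def dedupe_keep_order(items):
    """Return the unique items in order of first appearance (Python 3.7+)."""
    # dict.fromkeys builds a dict whose keys are the items. A repeated key
    # does not move, so each item keeps the position of its first occurrence.
    return list(dict.fromkeys(items))


text = "Beautiful is better than ugly."
print(dedupe_keep_order(text.split()))   # ['Beautiful', 'is', 'better', 'than', 'ugly.']
print(dedupe_keep_order([3, 1, 3, 2, 1]))  # [3, 1, 2]
```

As with the comprehension, every element must be hashable.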

I'd like to know what the community thinks of this use case and the order preserving feature in general.

  • Is there any reason this method shouldn't be used?
  • Are there better ways to solve this problem?
  • Is this method Pythonic?

Reading through the Zen of Python, I am conflicted. The method is simple but relies on implicit ordering.

Please let me know what you think. Thank you.

user2357112

2 Answers

7

This approach of using a Python 3.7 dictionary as an order-preserving de-dupe is vetted by a core Python developer here. You can't really get a better recommendation than that.

Is there any reason this method shouldn't be used?

No.

Are there better ways to solve this problem?

No.

Is this method Pythonic?

Yes.

The method is simple but relies on implicit ordering.

Your question is tagged python-3.7. In that version, dictionaries are guaranteed to preserve insertion order, so the ordering here is not implicit.

wim
  • Thanks @wim. I didn't see the 3.7 update. That is the best green light out there. Re: implicit, I'd like to argue that being part of the spec still doesn't make it explicit. Explicit would look more like `dict.fromkeys("abc", ordered=True)`. There are lots of behaviors that might be the default, but you wouldn't know it unless you stumbled upon them. – Nathaniel Rivera Saul Jul 03 '18 at 02:12
  • It's still going to be a very dangerous practice for backward compatibility reasons for quite a while, though. – user2357112 Jul 03 '18 at 02:13
  • @NathanielSaul Well, if you want it to be more explicit then nothing stops you doing the same thing with `collections.OrderedDict.fromkeys(...)`. Personally, I prefer to see the dict comprehension, as long as you don't need to support older Python versions. – wim Jul 03 '18 at 02:19
  • 1
    It seemed more like Hettinger was recommending that there's no faster way to perform the task, not necessarily recommending to use that as the go-to method on into the future. Dictionaries are still viewed in the broader computer science world, by definition, as saying nothing about the ordering of keys. The Python core devs may feel like that doesn't matter for the time being, but what about in 10 years when someone finds a more efficient dict implementation that ignores ordering? – David Sanders Mar 06 '19 at 20:56
  • 1
    @DavidSanders Then I will update this answer in 10 years. – wim Mar 06 '19 at 21:02
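The more explicit spelling mentioned in the comments above, `collections.OrderedDict.fromkeys(...)`, could look roughly like the sketch below. The function name `dedupe_explicit` is hypothetical; the point is only that the type name itself announces that order matters, and that it behaves the same on Python versions older than 3.7.

```python
from collections import OrderedDict


def dedupe_explicit(items):
    """Order-preserving de-dupe that states its ordering requirement."""
    # OrderedDict remembers insertion order on every Python version that
    # provides it, so this does not depend on the 3.7 dict guarantee.
    return list(OrderedDict.fromkeys(items))


print(dedupe_explicit("abcabd"))  # ['a', 'b', 'c', 'd']
```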
7

This works great on Python 3.7! But Python 3.7 isn't the only Python version around. Relying on dict order preservation is going to be a dangerous habit for quite a while, because if your code ever runs on a Python version before 3.6, it'll stop maintaining order, completely silently.

Relying on, say, dataclasses or contextvars isn't anywhere near as dangerous, because if you try to run code that relies on dataclasses on a Python that doesn't have dataclasses, you get a big, clear ImportError. Dicts losing their order doesn't have the same obviousness to it.

You may have no idea it's stopped maintaining order. You may not remember you relied on dict order. You might forget to document or tell anyone that you relied on it, or you might be the poor coder who inherits code where someone else relied on dict order without documenting the Python 3.7+ requirement. You may have no idea you forgot to update Python on one particular machine, or that you accidentally dropped out of Anaconda or whatever and you're on a system Python 3 that's still at 3.4.
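One way to turn that silent failure into a loud one, similar in spirit to the ImportError you get for a missing module, is a guard that runs when the module is imported. This is a sketch under assumptions: `require_ordered_dicts` and `MIN_VERSION` are names invented here, and the message is formatted with `str.format` so the guard itself still runs on interpreters that predate f-strings.

```python
import sys

# Hypothetical constant: first version where dict order is guaranteed.
MIN_VERSION = (3, 7)


def require_ordered_dicts(version_info=None):
    """Raise RuntimeError if the interpreter predates guaranteed dict order."""
    if version_info is None:
        version_info = sys.version_info
    # Compare only (major, minor); e.g. (3, 4) < (3, 7) is True.
    if tuple(version_info[:2]) < MIN_VERSION:
        raise RuntimeError(
            "This code relies on dict insertion order (Python 3.7+), "
            "but is running on Python {}.{}".format(*version_info[:2])
        )


# Call once at import time so an old interpreter fails immediately.
require_ordered_dicts()
```

The guard also serves as documentation: anyone who inherits the code can see the 3.7+ requirement at the top of the module.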

It'll be safe to assume dict order eventually. For now, especially right now, a few days after the release of 3.7, it's a better idea to use OrderedDict, or add a version check:

import collections
import sys

_make_ordered_mapping = (dict.fromkeys if sys.version_info >= (3, 7)
                         else collections.OrderedDict.fromkeys)

def ordered_dedup(items):
    return list(_make_ordered_mapping(items))
user2357112