2

I use a set when I need to keep a reference list of values which I want to keep unique (and later on, check if something is in that set). This does not work with dict because it is not hashable.

There are quite a few techniques to "uniquify" a list of dict but all of them assume that I have a final list, which I want to reduce to unique elements.

How to do that in a dynamic way? For a set I would just .add() and element and would know that it will be added only if it is unique. Is such a (EDIT: ideally, but not necessarily) built-in mechanism available for a bag of dict (I use the word "bag" because I do not want to limit possible answers to any data container)

Community
  • 1
  • 1
WoJ
  • 27,165
  • 48
  • 180
  • 345
  • You can implement a custom set-like object using a custom hash function based on dictionaries ? – PinkFluffyUnicorn Mar 17 '17 at 13:38
  • ["Bag" means "multiset"](https://en.wikipedia.org/wiki/Set_(abstract_data_type)#Multiset); maybe you should just ask about hashable/immutable `dict`s and ignore the container's type. – jwodder Mar 17 '17 at 13:38
  • So every time a key or value changes in any `dict` your set-like structure will be re-tested? – Chris_Rands Mar 17 '17 at 13:43
  • What about using an immutable [frozen dict](https://pypi.python.org/pypi/frozendict/)? –  Mar 17 '17 at 14:24
  • @Chris_Rands: I am not sure what you mean? I would like to add dicts to that container and just ensure that there are no duplicates (and later check against that container). Once they are in, they will not change. – WoJ Mar 17 '17 at 15:48
  • @SembeiNorimaki: it looks like this is exactly what I am looking for (from first tests). Would you mind turning this into an answer? I will test more in depth and be back. Thanks. – WoJ Mar 17 '17 at 15:54
  • @WoJ Ah I see, it was not clear to me that the dictionaries would be fixed, in that case clearly some kind of frozen-dict type construct is the way to go, relevant: http://stackoverflow.com/questions/2703599/what-would-a-frozen-dict-be – Chris_Rands Mar 17 '17 at 16:03
  • The problem is easily solved by switching from ``someset.add(somedict)`` to ``someset.update(somedict.items())``. See my answer below for an example. It is pretty easy to do dynamic, incremental updates in this fashion :-) – Raymond Hettinger Mar 19 '17 at 06:35

3 Answers3

1

The set classes are implemented using dictionaries. Accordingly, the requirements for set elements are the same as those for dictionary keys; namely, that the element defines both eq() and hash(). As a result, sets cannot contain mutable elements such as lists or dictionaries. However, they can contain immutable collections such as tuples or instances of ImmutableSet.

So if you only want to use built-ins, you could convert your dictionaries to a tuple of tuples upon entering them to a set, and converting them back to dictionaries when you want to use them.

dict_set = set()
dict_set.add(tuple(a_dict.items()))
Giannis Spiliopoulos
  • 2,628
  • 18
  • 27
  • That is a nice workaround. However, there is no guarantee on the sorting of items (I think) so you may end up with duplicated entries. That may be solved by adding a `sort` there. And this is assuming that both keys and values are hashable (which I think that it is a reasonable assumption). – MariusSiuram Mar 17 '17 at 15:25
  • Thank you - I edited my question to soften the "built-in" part, what I meant was "without completely rewriting a set-like container for dicts". Your solution is great, I have doubts about the order of the tuple elements (similar to @MariusSiuram) – WoJ Mar 17 '17 at 15:46
  • 1
    @MariusSiuram Instead of a `sorted` `tuple`, you could also just use a `frozenset`. – tobias_k Mar 17 '17 at 15:50
1

You can use a frozen dict which is an immutable implementation of a regular dict.

This approach should allow you to use frozen dicts inside a set.

>>> from frozendict import frozendict
>>> x = [frozendict({'a':2, 'b':3}),frozendict({'b':3, 'a':2})]
>>> set(x)
{<frozendict {'b': 3, 'a': 2}>}
>>> frozendict({'b': 3, 'a': 2}) in set(x)
True
>>> frozendict({'b': 4, 'a': 2}) in set(x)
False
>>> frozendict({'a': 2, 'b': 3}) in set(x)
True
WoJ
  • 27,165
  • 48
  • 180
  • 345
  • Thanks. I am adding some code (my first tests) to your answer to show the usage – WoJ Mar 17 '17 at 16:02
0

For a set I would just .add() and element and would know that it will be added only if it is unique.

Instead of add(), use update() or |= with the dictionary's items. That will meet your goal of adding dynamically and incrementally while "knowing that it will be added only if it is unique.":

>>> d = dict(raymond='red')
>>> e = dict(raymond='blue')
>>> f = dict(raymond='red')

>>> s = set()
>>> s |= d.items()
>>> s |= e.items()
>>> s |= f.items()
>>> s
Raymond Hettinger
  • 216,523
  • 63
  • 388
  • 485