6

How can i make a set of dictionaries from one list of dictionaries?

Example:

import copy

v1 = {'k01': 'v01', 'k02': {'k03': 'v03', 'k04': {'k05': 'v05'}}}
v2 = {'k11': 'v11', 'k12': {'k13': 'v13', 'k14': {'k15': 'v15'}}}

data = []
N = 5
for i in range(N):
    data.append(copy.deepcopy(v1))
    data.append(copy.deepcopy(v2))

print data

How would you create a set of dictionaries from the list data?

NS: One dictionary is equal to another when they are structurally the same. That means, they got exactly the same keys and same values (recursively)

BPL
  • 9,632
  • 9
  • 59
  • 117
  • What is your required tuple output? – Moinuddin Quadri Aug 29 '16 at 10:03
  • `data = set(data)`? – ForceBru Aug 29 '16 at 10:03
  • 1
    @ForceBru `TypeError: unhashable type: 'dict'` – BPL Aug 29 '16 at 10:04
  • Why would you want to do this? Could you expand upon your question a bit? This seems like an exotic solution to a problem that may be better solved otherwise. – Nelewout Aug 29 '16 at 10:06
  • @BPL Shouldn't the title of question be `Convert a dictionary of dictionaries into a set of dictionaries` instead of "Convert a list of dictionaries into a set of dictionaries"? i.e `list` -> `dict`. I am still confused regarding the question – Moinuddin Quadri Aug 29 '16 at 10:14
  • 4
    The [forzendict](https://pypi.python.org/pypi/frozendict) package provides *frozendicts* that are immutable and hashable, so you can create a set of those. – Tim Fuchs Aug 29 '16 at 10:15
  • @TimFuchs: But within set, if two different `frozendict` objects are there even with the same structure. It will be treated at different entitity. I do not think this is what user wants – Moinuddin Quadri Aug 29 '16 at 10:20
  • @TimFuchs Mmm, not sure about the [__hash__](https://github.com/slezica/python-frozendict/blob/master/frozendict/__init__.py#L36) function provided by frozendict, let me test it and I'll find a counterexample – BPL Aug 29 '16 at 10:22
  • @MoinuddinQuadri Judging by the hash function, having the same keys and values produces the same hash, so that fulfulls the OP requirements. – Tim Fuchs Aug 29 '16 at 10:32
  • @BPL, it you don't like their hash, you could use the one from [PEP 416](http://legacy.python.org/dev/peps/pep-0416/#id9) (`hash(frozenset(dict.items()))`) – Tim Fuchs Aug 29 '16 at 10:34
  • @TimFuchs Edited my answer, I'm getting an error using frozendict package – BPL Aug 29 '16 at 10:35
  • 1
    You need frozendicts of frozendicts. – Bernhard Aug 29 '16 at 10:37
  • Does this answer your question? [Python - List of unique dictionaries](https://stackoverflow.com/questions/11092511/python-list-of-unique-dictionaries) – Alexandr Zarubkin Dec 18 '20 at 08:58

4 Answers4

12

A cheap workaround would be to serialize your dicts, for example:

import json

dset = set()

d1 = {'a':1, 'b':{'c':2}}
d2 = {'b':{'c':2}, 'a':1} # the same according to your definition
d3 = {'x': 42}

dset.add(json.dumps(d1, sort_keys=True))
dset.add(json.dumps(d2, sort_keys=True))
dset.add(json.dumps(d3, sort_keys=True))

for p in dset:
    print json.loads(p) 

In the long run it would make sense to wrap the whole thing in a class like SetOfDicts.

georg
  • 211,518
  • 52
  • 313
  • 390
  • Nice one! I haven't been able yet to find any counterexample, it seems it works even when keys are shuffled at deep levels of the dictionaries – BPL Aug 29 '16 at 10:51
  • @BPL: of course this only works as long as your dicts (and all sub-elements) are serializable. – georg Aug 29 '16 at 10:52
  • Fair enough, I can live with that ;) – BPL Aug 29 '16 at 10:52
1

Dictionaries are mutable and therefore not hashable in python.

You could either create a dict-subclass with a __hash__ method. Make sure that the hash of a dictionary does not change while it is in the set (that probably means that you cannot allow modifying the members). See http://code.activestate.com/recipes/414283-frozen-dictionaries/ for an example implementation of frozendicts.

If you can define a sort order on your (frozen) dictionaries, you could alternatively use a data structure based on a binary tree instead of a set. This boils down to the bisect solution provided in the link below.

See also https://stackoverflow.com/a/18824158/5069869 for an explanation why sets without hash do not make sense.

Community
  • 1
  • 1
Bernhard
  • 2,084
  • 2
  • 20
  • 33
  • 1
    Ok, you've pointed out something is already well-known, no solution to the question though – BPL Aug 29 '16 at 10:09
  • What was your question? Why do you want a set? Do you only want to remove duplicates? Do you want to use set operations? – Bernhard Aug 29 '16 at 10:12
  • 2
    @BPL If you already know that `dict` are not `hashable`, then your question has no point. How can you then make set of `dicts`? Either you can make set of `keys` in `dict`, `values` in dict or `both`, BUT not dict itself – Moinuddin Quadri Aug 29 '16 at 10:17
0

not exactly what you're looking for as this accounts for lists too but:

def hashable_structure(structure):
    if isinstance(structure, dict):
        return {k: hashable_structure(v) for k, v in structure.items()}
    elif isinstance(structure, list):
        return {hashable_structure(elem) for elem in structure)}
    else:
        return structure
Zia
  • 2,735
  • 3
  • 30
  • 27
-1

Riffing off of georg's answer - who has the meat of it; But, more succinctly:

set(json.dumps(i, sort_keys=True) for i in data)

where data is your list of dicts