30

I have a list:

d = [{'x':1, 'y':2}, {'x':3, 'y':4}, {'x':1, 'y':2}]

{'x':1, 'y':2} appears more than once, and I want to remove the duplicate from the list. My result should be:

 d = [{'x':1, 'y':2}, {'x':3, 'y':4} ]

Note: list(set(d)) does not work here; it throws an error.

Shawn Chin
ramesh.c
  • 4
    `set()` will try to hash each element of the list you give it. A `dict` is not hashable in Python, which is why `set(d)` will throw a `TypeError` – Rodrigue Jun 08 '11 at 15:09
  • 2
    is it always just two element dicts? Avoid this whole problem and use tuples instead. – Jochen Ritzel Jun 08 '11 at 15:10
  • 1
    Possible duplicate of [Python - List of unique dictionaries](http://stackoverflow.com/questions/11092511/python-list-of-unique-dictionaries) – tripleee Jul 26 '16 at 06:14
  • 1
    @tripleee this is not duplicate. the one you pointed is using a single attribute in the dictionary which is unique. in this case there is no unique attribute. – orenma Jul 26 '16 at 11:04
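
The error mentioned in the question can be reproduced with a minimal sketch like the one below. The variable name `error_message` is only illustrative.

```python
d = [{'x': 1, 'y': 2}, {'x': 3, 'y': 4}, {'x': 1, 'y': 2}]

try:
    # set() has to hash every element, and dicts are not hashable
    set(d)
except TypeError as exc:
    error_message = str(exc)

print(error_message)  # unhashable type: 'dict'
```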

6 Answers

33

If the values in your dicts are hashable, this will work:

>>> [dict(y) for y in set(tuple(x.items()) for x in d)]
[{'y': 4, 'x': 3}, {'y': 2, 'x': 1}]

EDIT:

I tried it with no duplicates and it seemed to work fine:

>>> d = [{'x':1, 'y':2}, {'x':3, 'y':4}]
>>> [dict(y) for y in set(tuple(x.items()) for x in d)]
[{'y': 4, 'x': 3}, {'y': 2, 'x': 1}]

and

>>> d = [{'x':1,'y':2}]
>>> [dict(y) for y in set(tuple(x.items()) for x in d)]
[{'y': 2, 'x': 1}]
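
If the key order may differ between dicts, or the order of the original list matters, a variant along these lines might help. This is only a sketch: it still assumes every value is hashable, and the helper name `dedupe_dicts` is made up for illustration.

```python
def dedupe_dicts(dicts):
    """Return the unique dicts from `dicts`, keeping first-seen order."""
    seen = set()
    result = []
    for item in dicts:
        # frozenset ignores the order of the (key, value) pairs,
        # so {'x': 1, 'y': 2} and {'y': 2, 'x': 1} produce the same key
        key = frozenset(item.items())
        if key not in seen:
            seen.add(key)
            result.append(item)
    return result


d = [{'x': 1, 'y': 2}, {'x': 3, 'y': 4}, {'y': 2, 'x': 1}]
unique = dedupe_dicts(d)
print(unique)
```

Unlike the set-based one-liner, this keeps the first occurrence of each dict in its original position.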
GWW
  • 1
    This works only if the order of the items is guaranteed to be consistent (i.e. x always before y) across all the dicts. I know consistent order is guaranteed for a single dict, but I'm not so sure in this case. – Jochen Ritzel Jun 08 '11 at 15:20
  • 1
    `x.iteritems()` -> `tuple(x.iteritems())` or you're comparing generator objects. – Jochen Ritzel Jun 08 '11 at 15:21
  • But if there is no duplicate element, this gives me the wrong result. – ramesh.c Jun 08 '11 at 15:23
  • @Jochen Ritzel: I changed that back; it was some premature optimization. Thanks. – GWW Jun 08 '11 at 15:24
  • @ramesh.c: I tried it with no duplicates and it seemed to work okay – GWW Jun 08 '11 at 15:27
  • @GWW; what about if I only try to match 'x' key. If 'x' value matches it is a duplicate. – ramesh.c Jun 08 '11 at 15:31
  • @ramesh That would need a different solution, and so you might get more focused answers if you ask that as a separate question. – Shawn Chin Jun 08 '11 at 15:44
  • This option will not work if the values of the dictionaries are unhashable (e.g. another dict, list, or set). In this case it works because all the values are integers. – orenma Jul 26 '16 at 11:21
8

Dicts aren't hashable, so you can't put them in a set. A relatively efficient approach is to turn each dict's (key, value) pairs into a frozenset and hash those (feel free to eliminate the intermediate variables):

# frozenset (unlike set) is hashable, and it ignores the order of the pairs
pair_sets = [frozenset(d.items()) for d in dicts]
unique = set(pair_sets)
result = [dict(pairs) for pairs in unique]

If the values aren't always hashable, this is not possible at all using sets, and you'll probably have to use the O(n^2) approach with an `in` check per element.

7

Avoid this whole problem and use namedtuples instead:

from collections import namedtuple

Point = namedtuple('Point','x y'.split())
better_d = [Point(1,2), Point(3,4), Point(1,2)]
print(set(better_d))
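
If the data already exists as a list of dicts, one possible way to move it into namedtuples and back is sketched below. It assumes every dict has exactly the keys 'x' and 'y'; the names `points`, `unique_points` and `unique_dicts` are only illustrative.

```python
from collections import namedtuple

Point = namedtuple('Point', ['x', 'y'])

d = [{'x': 1, 'y': 2}, {'x': 3, 'y': 4}, {'x': 1, 'y': 2}]

# Point(**item) raises TypeError if a dict has missing or extra keys
points = [Point(**item) for item in d]

# namedtuples are hashable as long as their fields are, so a set works
unique_points = set(points)

# _asdict() converts each namedtuple back to a mapping if dicts are needed
unique_dicts = [p._asdict() for p in unique_points]
```

Note that a set does not preserve the order of the original list.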
Jochen Ritzel
5

A simple loop:

tmp=[]

for i in d:
    if i not in tmp:
        tmp.append(i)        
tmp
[{'x': 1, 'y': 2}, {'x': 3, 'y': 4}]
Fredrik Pihl
  • 2
    5 lines vs. 1 (in GWW's answer) and not even equally readable... this is why terseness isn't evil in right doses. –  Jun 08 '11 at 15:17
  • 3
    actually it's only 4 (don't count the print). :-) Regarding the readability, well simplicity is in the eye of the beholder :-) – Fredrik Pihl Jun 08 '11 at 15:23
4

Converting each dict to a tuple won't work if any value in the dict is unhashable, such as a list.

e.g.,

data = [
  {'a': 1, 'b': 2},
  {'a': 1, 'b': 2},
  {'a': 2, 'b': 3}
]

using [dict(y) for y in set(tuple(x.items()) for x in data)] will get the unique data.

However, the same approach fails on data like this:

data = [
  {'a': 1, 'b': 2, 'c': [1,2]},
  {'a': 1, 'b': 2, 'c': [1,2]},
  {'a': 2, 'b': 3, 'c': [3]}
]

If performance is not a concern, json dumps/loads could be a good choice:

import json

# serialize each dict to a string so it can go into a set, then parse it back
data = set(json.dumps(d) for d in data)
data = [json.loads(d) for d in data]
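
A slightly fuller sketch of the JSON idea follows, assuming every value is JSON-serializable. It passes `sort_keys=True` so that dicts with the same content but different key order serialize to the same string, and it keeps the first-seen order of the list. The helper name `dedupe_via_json` is hypothetical.

```python
import json


def dedupe_via_json(dicts):
    """Return unique dicts, using a canonical JSON string as the set key."""
    seen = set()
    result = []
    for item in dicts:
        # sort_keys=True makes the string independent of key order
        key = json.dumps(item, sort_keys=True)
        if key not in seen:
            seen.add(key)
            result.append(item)
    return result


data = [
    {'a': 1, 'b': 2, 'c': [1, 2]},
    {'b': 2, 'a': 1, 'c': [1, 2]},
    {'a': 2, 'b': 3, 'c': [3]},
]
unique_data = dedupe_via_json(data)
print(unique_data)
```

Because the original dicts are returned rather than re-parsed, values are not altered by a round trip through JSON.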
Eric
0

Another bit of dark magic (please don't beat me):

map(dict, set(map(lambda x: tuple(x.items()), d)))
Artsiom Rudzenka
  • 5
    `map` isn't dark magic, it's just an ugly way to write a list comprehension ;) But note that in Python 2, this is very inefficient, as it creates several (three if I'm counting correctly) intermediate lists that aren't needed at all. –  Jun 08 '11 at 15:31
  • OK, I agree that we have several unneeded lists here. However, as far as I can see, all of the solutions published above also use more than one list. But why is map an ugly way to write a list comprehension? – Artsiom Rudzenka Jun 08 '11 at 15:43
  • 1
    If by dark magic you mean obfuscated, then you could just as well have gone for `l=type;y,z,l,o=(map,l({}),set,l(()));y(z,l(o(x.items())for x in d))`. Nested map are generally not as intuitive to read compared to list comprehension. – Shawn Chin Jun 08 '11 at 15:51
  • @Artsimon: Mine for instance creates a tuple of `n` two-tuples from a generator, a set from an iterator and a single list containing `n` dictionaries. Yours is the same except that it doesn't use generators and iterators when it could. –  Jun 08 '11 at 15:57
  • 1
    @delnan: thank you for providing details. i am only starting my way in python(especially in understanding differences between iterators and generator), so any comments are very appreciated and always helpful. – Artsiom Rudzenka Jun 08 '11 at 16:00