
I have a list in Python like this:

my_list = [{'1':'A','2':'B'}]

Now I want to append more JSON objects (dicts) to my_list, but first I want to check whether the one I am adding already exists. How could I go about this?

So if I were going to append {'2':'B','1':'A'}, I would want that not to be added.

How could I do this?

Thanks

spen123

2 Answers


You can check whether two dictionaries are equal with ==; the order of the keys does not matter:

In [2]: {'1':'A','2':'B'}=={'2':'B','1':'A'}
Out[2]: True

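Therefore, to check whether JSON (the dict you want to add) already exists in my_list before appending it, you can simply do:

if JSON not in my_list:
    my_list.append(JSON)  # only added when no equal dict is already present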
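
As a minimal sketch, the membership check can be wrapped in a small helper. The name append_if_new is hypothetical and not part of the original answer. Note that `in` on a list compares against each element in turn, so the cost grows with the length of the list.

```python
def append_if_new(items, candidate):
    """Append candidate to items unless an equal dict is already present.

    Hypothetical helper; returns True if the dict was appended.
    """
    # `in` compares with ==, so key order inside the dicts does not matter.
    if candidate in items:
        return False
    items.append(candidate)
    return True


my_list = [{'1': 'A', '2': 'B'}]
added_same = append_if_new(my_list, {'2': 'B', '1': 'A'})  # same content, different key order
added_new = append_if_new(my_list, {'1': 'A', '2': 'C'})   # different value for '2'
print(added_same, added_new, len(my_list))
```

Here the first call leaves the list unchanged and the second one appends.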

Update:

To use a set with your data, you can define your own dict subclass and implement its __hash__() method. You can start from here:

import json

class MyJSON(dict):
    def __hash__(self):
        # Serialize with sorted keys so equal dicts hash equally regardless of key order
        return hash(json.dumps(self, sort_keys=True))

Example:

a = MyJSON({'1':'A','2':'B'})
b = MyJSON({'1':'A','2':'C'})
c = MyJSON({'2':'B','1':'A'})  # should be equal to a
print(a == c)  # True
my_set = set()
my_set.add(a)
my_set.add(b)
my_set.add(c)  # ignored: equal to a and has the same hash
for item in my_set:
    print(item, end=' ')
## output (set order may vary): {'1': 'A', '2': 'C'} {'1': 'A', '2': 'B'}
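
If you would rather keep plain dicts in the list, one possible variation (a sketch, not part of the original answer) is to keep a set of canonical JSON strings next to the list and check that set before appending. The names canonical_key, seen and append_unique are hypothetical, and the sketch assumes every dict is JSON-serializable.

```python
import json


def canonical_key(obj):
    # sort_keys=True makes the string independent of key order.
    return json.dumps(obj, sort_keys=True)


my_list = [{'1': 'A', '2': 'B'}]
seen = {canonical_key(d) for d in my_list}  # keys of everything already in my_list


def append_unique(obj):
    """Append obj to my_list only if no equal dict was added before."""
    key = canonical_key(obj)
    if key in seen:
        return False
    seen.add(key)
    my_list.append(obj)
    return True


r1 = append_unique({'2': 'B', '1': 'A'})  # duplicate of the existing entry
r2 = append_unique({'1': 'A', '2': 'C'})  # new entry
print(r1, r2, my_list)
```

This keeps the insertion order of the list, and each check is a set lookup instead of a scan over the whole list, at the price of serializing every dict once.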
Shaoran Hu
  • Yes, but as the array gets longer this doesn't work unless I add a loop to go through each one, which I am trying to avoid – spen123 Oct 30 '15 at 17:22
  • @spenf10 But isn't `in` supposed to act like a loop over the list? Have you checked that the new `JSON` you are appending is 100% identical to an old one? Anyway, if you have a long list and need to check every time, maybe you should consider a `set`, as some other answers suggest. – Shaoran Hu Oct 30 '15 at 17:27
  • right but a `set` doesn't allow me to add a dictionary to it – spen123 Oct 30 '15 at 17:29
  • @spenf10 Yeah... Sorry, I have just realised that. How about using `json.dumps(JSON, sort_keys=True)` to convert your data to a string, then? I know it will cost a lot, but in some cases this might be the best way to do it. – Shaoran Hu Oct 30 '15 at 17:33
  • @spenf10 I found a way to support `set`, see my update – Shaoran Hu Oct 30 '15 at 17:55
  • This solution looks very nice to me – SomethingSomething Oct 30 '15 at 23:21

The canonical data structure to use for avoiding duplicates is the set. As you've mentioned, you can't use it directly here, because dicts are unhashable and so cannot be added to a set.

The usual fix for this is either to define a custom dict-like object that is hashable, or to freeze your dict into something hashable and add that to the set. We'll do the latter.

my_list = [{1:2, 3:4}, {3:4, 1:2}]
result = set()

for json_data in my_list:
    result.add(frozenset(json_data.items()))

print(result)
# {frozenset({(1, 2), (3, 4)})}
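
To apply the same idea to the "append only if new" case from the question, here is a minimal sketch. The names freeze and seen are hypothetical, and it assumes every value in the dicts is itself hashable (no nested dicts or lists).

```python
def freeze(d):
    # Assumes every value in d is hashable (no nested dicts or lists).
    return frozenset(d.items())


my_list = [{1: 2, 3: 4}]
seen = {freeze(d) for d in my_list}  # frozen form of everything already in my_list

for candidate in [{3: 4, 1: 2}, {1: 2, 3: 5}]:
    key = freeze(candidate)
    if key not in seen:  # set lookup, no scan over my_list
        seen.add(key)
        my_list.append(candidate)

print(my_list)
# [{1: 2, 3: 4}, {1: 2, 3: 5}]

# A frozen entry can be turned back into a dict with dict():
restored = dict(freeze({1: 2, 3: 4}))
```

The first candidate is skipped because it freezes to the same frozenset as the existing entry; only the second one is appended.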
Adam Smith