
I am trying to find the common items in two dictionaries for each key; maybe the code below can explain my purpose better. I have too many records, so this code takes a very long time to run. Is there any way to write it more efficiently?

rcmd = {'1':{"A","B"},"2":{"A","C"},"3":{"B","C","D"}}   
rmv = {'1':{"C","B"},"2":{"A","C"},"3":{"B","C","A"},"4":{"A"}}

correct_rcmd = []
for i in range(len(rcmd)):
    for j in range(len(rmv)):
        if rcmd.keys()[i] == rmv.keys()[j]:
            correct_rcmd.append(rcmd.values()[i].intersection(rmv.values()[j]))
print correct_rcmd
Patrick Artner

1 Answer


You should never use indexes into the keys of a dictionary - especially not in Python 2.x (as suggested by your use of print without any ()).

The order of keys in dictionaries is not guaranteed before 3.7 - in 3.6 it happens to be preserved only in CPython, as a side effect of an implementation detail; from Python 3.7 on it is guaranteed by default.

The order is then insertion order - if you insert keys in a different order in the two dictionaries, indexing into keys() will still break, because you are mixing the values of different keys!
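A minimal sketch of why positional indexing breaks (using made-up two-key dicts, not the OP's data): even when both dicts hold the same keys, insertion order decides each key's position, so pairing by index can pair up different keys:

```python
# Two dicts with the same keys, inserted in different order
# (hypothetical example for illustration only).
d1 = {'a': 1, 'b': 2}
d2 = {'b': 20, 'a': 10}

# In Python 3.7+ iteration follows insertion order, so the
# positions of the keys differ between the two dicts:
print(list(d1))  # ['a', 'b']
print(list(d2))  # ['b', 'a']

# Pairing keys by position therefore mixes different keys:
# position 0 pairs 'a' from d1 with 'b' from d2.
pairs = list(zip(d1, d2))
print(pairs)  # [('a', 'b'), ('b', 'a')]
```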

rcmd = {'1':{"A","B"},"2":{"A","C"},"3":{"B","C","D"}}   
rmv = {'1':{"C","B"},"2":{"A","C"},"3":{"B","C","A"},"4":{"A"}}

# get all keys that are common in both dictionaries
# python 2 or 3:
keys_that_are_same = set(rcmd) & set(rmv)
# for python 3 better, as it uses the key-sets directly: 
# keys_that_are_same = rcmd.keys() & rmv.keys()

# loop over both keys and get the intersection into a new dict:
common = {}
for key in keys_that_are_same:
    common[key] = rcmd[key] & rmv[key]

# as dict comprehension - no loop needed: 
# common = {k : rcmd[k] & rmv[k] for k in keys_that_are_same} 

print(common)

Output:

{'1': set(['B']), '3': set(['C', 'B']), '2': set(['A', 'C'])} # py 2.7

{'1': {'B'}, '2': {'A', 'C'}, '3': {'C', 'B'}}                # py 3.6

This also has the benefit of creating fewer objects that need to be constructed and handled, as well as using sets to reduce the number of keys to deal with in the first place.

  • Why are you forcing things into a list (which has to be created first and destroyed later) to test if something is in the list (which takes O(n) instead of O(1) for sets)? It takes far longer... I do not _feel it_ - at all. It is suboptimal. Will it work for 100 items: _sure_ - will it work for 10 million: yes, but it takes ages if the item you are looking for is the last one in the list. – Patrick Artner Nov 30 '18 at 08:56
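To illustrate the comment's point, here is a rough timing sketch (the names and sizes are made up for the demo, and absolute numbers vary by machine) comparing a worst-case membership test on a list vs. a set:

```python
import timeit

# Hypothetical benchmark: membership in a list is O(n),
# membership in a set is O(1) on average.
n = 100_000
data_list = list(range(n))
data_set = set(data_list)

# Look for the last element - the worst case for the list scan.
t_list = timeit.timeit(lambda: (n - 1) in data_list, number=100)
t_set = timeit.timeit(lambda: (n - 1) in data_set, number=100)

print(f"list: {t_list:.4f}s  set: {t_set:.4f}s")
```

On typical hardware the set lookup is orders of magnitude faster, which is exactly why the answer builds a set of common keys instead of scanning lists.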