With the data structures as given, you'd have to iterate through the second list of dictionaries once for every item in the first, which is relatively inefficient. All you care about is whether a given phone number already exists in the second list of dictionaries. The most efficient data structure for repeatedly testing whether a given value is present is a set (or a dict if you might need to index from phone numbers back to further information). So I would do it as follows:
a = [a1, a2, a3]
b = [b1, b2, b3]
a_phone_numbers_set = set(d['phone'] for d in a)
b_phone_numbers_set = set(d['phone'] for d in b)
result_A_minus_B = [d for d in a if d['phone'] not in b_phone_numbers_set]
result_B_minus_A = [d for d in b if d['phone'] not in a_phone_numbers_set]
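To make the snippet above concrete, here is a small self-contained run. The sample records and the 'name' field are made up for illustration; only the 'phone' key comes from the question.

```python
# Hypothetical sample data; only the 'phone' key matters for matching.
a = [
    {'name': 'Ann', 'phone': '555-0100'},
    {'name': 'Bob', 'phone': '555-0101'},
    {'name': 'Cid', 'phone': '555-0102'},
]
b = [
    {'name': 'Bob', 'phone': '555-0101'},
    {'name': 'Dee', 'phone': '555-0103'},
]

# Build each set once so every membership test below is O(1) on average.
a_phone_numbers_set = {d['phone'] for d in a}
b_phone_numbers_set = {d['phone'] for d in b}

result_A_minus_B = [d for d in a if d['phone'] not in b_phone_numbers_set]
result_B_minus_A = [d for d in b if d['phone'] not in a_phone_numbers_set]

print([d['name'] for d in result_A_minus_B])  # ['Ann', 'Cid']
print([d['name'] for d in result_B_minus_A])  # ['Dee']
```

Each list is traversed a constant number of times, so the total work grows with len(a) + len(b) rather than len(a) * len(b) as it would with nested loops.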
Or, if I wanted to create a function:
def unmatched_entries(list1, list2):
    # Collect the phone numbers in list2 once for fast membership tests.
    existing_entries = set(d['phone'] for d in list2)
    return [d for d in list1 if d['phone'] not in existing_entries]
Optionally, you could use an arbitrary key:
def unmatched_entries(list1, list2, matching_key):
    # Ignore entries in list2 that don't define matching_key.
    existing_entries = set(d[matching_key] for d in list2 if matching_key in d)
    return [d for d in list1 if matching_key in d and d[matching_key] not in existing_entries]
That version always skips entries from list1 that don't define the requested key - other behavior is possible.
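As one example of such other behavior, a variant could treat entries from list1 that lack the key as unmatched and keep them. This is only a sketch; the name unmatched_entries_keep_missing and the sample data are hypothetical.

```python
def unmatched_entries_keep_missing(list1, list2, matching_key):
    # Values of matching_key seen in list2 (entries without the key are ignored).
    existing_entries = {d[matching_key] for d in list2 if matching_key in d}
    # An entry lacking the key cannot match anything, so it is kept.
    return [
        d for d in list1
        if matching_key not in d or d[matching_key] not in existing_entries
    ]


# Illustrative data only.
list1 = [{'phone': '1'}, {'email': 'x@example.com'}, {'phone': '2'}]
list2 = [{'phone': '1'}]
print(unmatched_entries_keep_missing(list1, list2, 'phone'))
# [{'email': 'x@example.com'}, {'phone': '2'}]
```

Which of the two behaviors is right depends on whether a record without the key should count as "not in the other list" or as "not comparable at all".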
To match on multiple keys as alluded to by a briefly appearing comment, I would use a set of tuples of the values:
a_match_elements = set((d['phone'], d['email']) for d in a)
result_B_minus_A = [d for d in b if (d['phone'], d['email']) not in a_match_elements]
Again, this could be generalized to handle a sequence of keys.
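A minimal sketch of that generalization, assuming every dictionary defines every requested key and that the values are hashable. The name unmatched_entries_multi and the sample data are hypothetical.

```python
def unmatched_entries_multi(list1, list2, matching_keys):
    # Build the tuple of values used to compare two entries.
    def signature(d):
        # Assumes each key is present; a missing key raises KeyError.
        return tuple(d[k] for k in matching_keys)

    existing_entries = {signature(d) for d in list2}
    return [d for d in list1 if signature(d) not in existing_entries]


# Illustrative data only.
a = [{'phone': '1', 'email': 'a@example.com'}]
b = [
    {'phone': '1', 'email': 'a@example.com'},
    {'phone': '1', 'email': 'b@example.com'},
]
print(unmatched_entries_multi(b, a, ['phone', 'email']))
# [{'phone': '1', 'email': 'b@example.com'}]
```

Two entries match only when all of the listed keys agree, so the second entry in b is reported even though its phone number also appears in a.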