2

Edit: I need a solution which works with lists of dictionaries (dictionaries are not hashable), so this is not a duplicate of Compute list difference


I have two lists of dictionaries. They contain gmail labels / filters. I need to get differences between them in form of two differences lists. I want order of the elements to be preserved.

Example:

a=[
    {'name': 'label_a'},
    {'color': {'background': '#42d692', 'text': '#094228'}, 'name': 'label_b'},
    {'name': 'label_d'},
    {'name': 'label_e'},
]

b=[
    {'name': 'label_a'},
    {'name': 'label_b'},
    {'name': 'label_c'},
    {'name': 'label_d'},
]

Expected results:

a_diff=[
    {'color': {'background': '#42d692', 'text': '#094228'}, 'name': 'label_b'},
    {'name': 'label_e'},
]

b_diff=[
    {'name': 'label_b'},
    {'name': 'label_c'},
]

This is how I could do this, but I think it might be inefficient. Can I do something better?

a_diff=[]
b_diff=[]

for i in a:
    if i not in b:
        a_diff.append(i)

for j in b:
    if j not in a:
        b_diff.append(j)

I don't need to keep original lists after this operation, so this can be in-place. Speed is most important. So this question can be also rephrased as: "How to remove common elements between two lists from those lists?"

When you provide solution, let know if it would work also on any two lists or only lists of dictionaries.

Also please let know if your solution will remove duplicates or not (I am interested in both kinds of solutions):

Example:

a=[
    {'name': 'label_a'},
    {'name': 'label_a'},
    {'name': 'label_b'},
]

b=[
    {'name': 'label_a'},
    {'name': 'label_b'},
    {'name': 'label_c'},
]

Results with duplicates removed:

a_diff=[
]

b_diff=[
    {'name': 'label_c'},
]

Results with duplicates not removed:

a_diff=[
    {'name': 'label_a'},
]

b_diff=[
    {'name': 'label_c'},
]
Karol Zlot
  • 2,887
  • 2
  • 20
  • 37
  • You need to use a hash-based approach (i.e. use a `set` instead of a list) if you want any hope for good performance. You are going to have to choose some other representation, because `dict` objects are not hashable. – juanpa.arrivillaga Jun 23 '21 at 00:06
  • If you get those results, how will you know which of the `a_dict` elements were the ones that differed? You only want to find the elements that are not shared between the two? Or what? – Karl Knechtel Jun 23 '21 at 01:07
  • @KarlKnechtel Yes, only the elements that are not shared between the two lists. – Karol Zlot Jun 23 '21 at 01:10

1 Answers1

2

this is a little bit of a weird approach, but it does the trick:

def make_hash(list_of_dicts):
    return {str(d): i for i, d in enumerate(list_of_dicts)}


def ordered_diff(d1, d2):
    return [eval(d) for d in sorted(d1.keys() - d2, key=d1.get)]


def get_diff(l1, l2):
    d1, d2 = make_hash(l1), make_hash(l2)
    return ordered_diff(d1, d2), ordered_diff(d2, d1)
acushner
  • 9,595
  • 1
  • 34
  • 34