
I have a list of python dicts like this:

[{
    'id': 1,
    'name': 'name1'
}, {
    'id': 2,
    'name': 'name2'
}, {
    'id': 3,
    'name': 'name1'
}]

What I want to do is create a new list of dictionaries containing only the entries whose 'name' value is duplicated, grouping them like this:

[{
    'id1': 1,
    'id2': 3,
    'name': 'name1'
}]

The first list is the output of an SQL query, and I need to delete the rows with a duplicated 'name', keeping only one of them.
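For the deduplication goal described above, a minimal sketch (assuming the rows are plain dicts as shown and that keeping the first occurrence of each name is acceptable):

```python
rows = [{'id': 1, 'name': 'name1'},
        {'id': 2, 'name': 'name2'},
        {'id': 3, 'name': 'name1'}]

# Keep the first row seen for each name, drop later duplicates
seen = set()
deduped = []
for row in rows:
    if row['name'] not in seen:
        seen.add(row['name'])
        deduped.append(row)

print(deduped)  # [{'id': 1, 'name': 'name1'}, {'id': 2, 'name': 'name2'}]
```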

Hyperion

3 Answers


You can use itertools.groupby:

import itertools
d = [{'id': 1, 'name': 'name1'}, {'id': 2, 'name': 'name2'}, {'id': 3, 'name': 'name1'}]
# Sort by 'name' so equal names are adjacent, then group them
new_data = [(name, list(group)) for name, group in
            itertools.groupby(sorted(d, key=lambda x: x['name']),
                              key=lambda x: x['name'])]
# Keep only names with more than one row, numbering the ids id1, id2, ...
final_dicts = [{'name': name, **{f'id{i}': row['id'] for i, row in enumerate(rows, 1)}}
               for name, rows in new_data if len(rows) > 1]

Output:

[{'name': 'name1', 'id1': 1, 'id2': 3}]
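Note that itertools.groupby only groups consecutive items, which is why the list is sorted first; without sorting, non-adjacent duplicates land in separate groups. A quick illustration:

```python
import itertools

names = ['name1', 'name2', 'name1']

# Unsorted input: the two 'name1' items are not adjacent, so they form two groups
unsorted_keys = [k for k, _ in itertools.groupby(names)]
print(unsorted_keys)  # ['name1', 'name2', 'name1']

# Sorted input: equal names become adjacent and fall into a single group
sorted_keys = [k for k, _ in itertools.groupby(sorted(names))]
print(sorted_keys)  # ['name1', 'name2']
```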
Ajax1234

I suggest the following solution, which is quite easy to read and understand:

from collections import defaultdict

ds = [{'id': 1, 'name': 'name1'},
      {'id': 2, 'name': 'name2'},
      {'id': 3, 'name': 'name1'}]

newd = defaultdict(list)

for d in ds:
    newd[d['name']].append(d['id'])
# Here newd is {'name1': [1, 3], 'name2': [2]}

result = []
for k, v in newd.items():
    if len(v) > 1:
        # Number the ids sequentially so the keys match the desired id1, id2, ... form
        d = {f'id{i}': id_ for i, id_ in enumerate(v, 1)}
        d['name'] = k
        result.append(d)

print(result)  # [{'id1': 1, 'id2': 3, 'name': 'name1'}]
Laurent H.

You can use collections.Counter:

from collections import Counter
from operator import itemgetter
l = [{'id': 1, 'name': 'name1'}, {'id': 2, 'name': 'name2'}, {'id': 3, 'name': 'name1'}]

# Count how often each name occurs, then rebuild a dict for each duplicated name
print([{'name': n, **{'id%d' % i: d['id']
                      for i, d in enumerate([d for d in l if d['name'] == n], 1)}}
       for n, c in Counter(map(itemgetter('name'), l)).items() if c > 1])

This outputs:

[{'name': 'name1', 'id1': 1, 'id2': 3}]
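The one-liner rescans the whole list once per duplicated name. A two-pass variant (my own restructuring, not from the answer above) collects the ids in a single extra pass instead:

```python
from collections import Counter, defaultdict
from operator import itemgetter

l = [{'id': 1, 'name': 'name1'}, {'id': 2, 'name': 'name2'}, {'id': 3, 'name': 'name1'}]

# First pass: count occurrences of each name
counts = Counter(map(itemgetter('name'), l))

# Second pass: collect the ids of every duplicated name
ids_by_name = defaultdict(list)
for d in l:
    if counts[d['name']] > 1:
        ids_by_name[d['name']].append(d['id'])

result = [{'name': n, **{f'id{i}': id_ for i, id_ in enumerate(ids, 1)}}
          for n, ids in ids_by_name.items()]
print(result)  # [{'name': 'name1', 'id1': 1, 'id2': 3}]
```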
blhsing