Removing duplicate keys from a list of dictionaries, keeping only the key-value pair where the value is maximum

Question

From a list like:

mylist = [{'x':2020 , 'y':20},{'x':2020 , 'y':30},{'x':2021 , 'y':10},{'x':2021 , 'y':5}]

I want to keep all 'x' unique and 'y' to be the maximum where 'x' is the same.

I am trying to get the output as:

mylist_unique =  [{'x':2020 , 'y':30},{'x':2021 , 'y':10}]

I have implemented it in a very naive way:

res =[]
temp = {}
print(len(temp))

for i in range(len(mylist)):
    print(mylist[i])
    for k,v in mylist[i].items():
        print(mylist[i]['x'],temp.keys(),mylist[i]['y'])
        if mylist[i]['x'] not in temp.keys() or mylist[i]['y'] > (temp[mylist[i]['x']]) :
            print(k)
            temp.update({mylist[i]['x']:mylist[i]['y']})

print(temp)
for k,v in temp.items():
    res.append({'x':k,'y':v})
print(res)

score 5 · Accepted Answer · answered Sep 17 '20 at 08:44

You can use a list comprehension with itertools.groupby:

from itertools import groupby

mylist = [{'x': 2020, 'y': 20}, {'x': 2020, 'y': 30}, {'x': 2021, 'y': 10}, {'x': 2021, 'y': 5}]

mylist_unique = [{'x': key, 'y': max(item['y'] for item in values)}
                 for key, values in groupby(mylist, lambda dct: dct['x'])]
print(mylist_unique)

This yields

[{'x': 2020, 'y': 30}, {'x': 2021, 'y': 10}]
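One caveat worth noting: groupby only merges consecutive items that share a key, so the snippet above relies on mylist already being ordered by 'x'. A minimal sketch for unsorted input, assuming it is acceptable to sort first (the names unsorted_list, get_x and unique_sorted are made up for illustration):

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical input in which equal 'x' values are not adjacent
unsorted_list = [{'x': 2021, 'y': 10}, {'x': 2020, 'y': 20},
                 {'x': 2021, 'y': 5}, {'x': 2020, 'y': 30}]

get_x = itemgetter('x')  # behaves like lambda dct: dct['x']

# Sort by 'x' first so that groupby sees each key as one consecutive run
unique_sorted = [{'x': key, 'y': max(item['y'] for item in values)}
                 for key, values in groupby(sorted(unsorted_list, key=get_x), key=get_x)]
print(unique_sorted)  # [{'x': 2020, 'y': 30}, {'x': 2021, 'y': 10}]
```

Without the sorted call, this input would produce four groups instead of two.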

Jan

What is `lambda dct: dct['x']`? – richa verma Sep 17 '20 at 08:54
@richaverma: This is how groupby works. You can feed it a key function that accumulates the values based on a key (in your example `x` from every dict in your list). – Jan Sep 17 '20 at 08:56
@richaverma The keyword `lambda` is used to define anonymous inline functions. `groupby`'s second argument must be a function. You could define that function beforehand by writing `def function_for_groupby(dct): return dct['x']` and then `groupby(mylist, function_for_groupby)` or you can define the function inline without giving it a name, using `lambda`. See also: https://stackoverflow.com/questions/5233508/what-exactly-is-lambda-in-python – Stef Sep 17 '20 at 09:21
what is passed in argument during function call? – richa verma Sep 17 '20 at 09:46
How would this code look if I had multiple keys in a dict and I would like to retain all of them but filter by a specific one only? – marcin2x4 Oct 18 '22 at 13:26
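
Regarding the last comment, one possible sketch, assuming each dict carries extra keys and the whole dict with the largest 'y' should be kept per 'x'. The 'label' key and the names records and best_records are invented for illustration. The idea is to pass a key function to max instead of extracting 'y' first:

```python
from itertools import groupby

# Hypothetical records with an extra 'label' key; already sorted by 'x'
records = [{'x': 2020, 'y': 20, 'label': 'a'}, {'x': 2020, 'y': 30, 'label': 'b'},
           {'x': 2021, 'y': 10, 'label': 'c'}, {'x': 2021, 'y': 5, 'label': 'd'}]

# For each group, max() returns the complete dict whose 'y' is largest
best_records = [max(group, key=lambda dct: dct['y'])
                for _, group in groupby(records, key=lambda dct: dct['x'])]
print(best_records)
# [{'x': 2020, 'y': 30, 'label': 'b'}, {'x': 2021, 'y': 10, 'label': 'c'}]
```

If two dicts in a group tie on 'y', max returns the first one it encounters.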

score 1 · Answer 2 · answered Sep 17 '20 at 08:35

Try this piece of code; it should do what you are looking for:

ht = dict()
for elem in mylist:
    if elem['x'] in ht:
        ht[elem['x']] = max(ht[elem['x']],elem['y'])
    else:
        ht[elem['x']]=elem['y']

mylist_unique=[]

for key in ht:
    mylist_unique.append({'x':key,'y':ht[key]})
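
The same single-pass idea can be written a little more compactly. This is only a sketch of an equivalent variant, not a change to the approach above; the name best is arbitrary, and it leans on dict.get to supply a default for keys not seen yet:

```python
mylist = [{'x': 2020, 'y': 20}, {'x': 2020, 'y': 30},
          {'x': 2021, 'y': 10}, {'x': 2021, 'y': 5}]

best = {}
for elem in mylist:
    x, y = elem['x'], elem['y']
    # dict.get falls back to y itself when x has not been seen yet
    best[x] = max(best.get(x, y), y)

# Rebuild the list of dicts from the x -> max(y) mapping
mylist_unique = [{'x': x, 'y': y} for x, y in best.items()]
print(mylist_unique)  # [{'x': 2020, 'y': 30}, {'x': 2021, 'y': 10}]
```

Because it keeps a running maximum per key, this version does not depend on the input being sorted by 'x'.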

score 1 · Answer 3 · answered Sep 17 '20 at 08:38

A simple groupby should work for you: since you want unique x values, group by x and then take the maximum value of y within each group.

import itertools

mylist = [{'x': 2020, 'y': 20}, {'x': 2020, 'y': 30}, {'x': 2021, 'y': 10}, {'x': 2021, 'y': 5}]
mylist1 = []
for key, group in itertools.groupby(mylist, lambda d: d["x"]):
    # take the maximum of the group directly; starting from max_y = 0
    # would give a wrong result when every y in a group is negative
    max_y = max(thing["y"] for thing in group)
    mylist1.append({"x": key, "y": max_y})
print(mylist1)

score 1 · Answer 4 · answered Sep 17 '20 at 08:55

You can use map and groupby to do this in one line:

from itertools import groupby

list(map(lambda a:{'x':a[0], 'y':max(map(lambda b: b['y'], a[1]))}, groupby(mylist, lambda c: c['x'])))

This yields

[{'x': 2020, 'y': 30}, {'x': 2021, 'y': 10}]

A Co
