1

I have a list of dictionaries that are similar but not completely identical, and I want to keep only one of each.

example:

my_list = [
{"name" : "A","id" : 2,"value" : 279},
{"name" : "A","id" : 3,"value" : 463},
{"name" : "B","id" : 8,"value" : 508},
{"name" : "A","id" : 2,"value" : 647},
{"name" : "A","id" : 2,"value" : 969},
{"name" : "C","id" : 5,"value" : 384}]

I want to remove the dictionaries that share "name" and "id" but keep the one with the highest "value".

Example of what I want it to look like:

my_list = [
{"name" : "A","id" : 3,"value" : 463},
{"name" : "B","id" : 8,"value" : 508},
{"name" : "A","id" : 2,"value" : 969},
{"name" : "C","id" : 5,"value" : 384}]

The values that got removed are:

{"name" : "A","id" : 2,"value" : 279},
{"name" : "A","id" : 2,"value" : 647}

because {"name" : "A","id" : 2,"value" : 969} has a higher "value".

{"name" : "A","id" : 3,"value" : 463} didn't get removed because its "id" is different.

How can I do that?

I tried looking at some questions like:

How to remove duplicate elements of, list of dictionaries in python

python_user
DeCErlET

5 Answers

3

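Try sorting by "value" in ascending order and storing each dictionary under its (name, id) key, so that higher values overwrite lower ones: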

my_list = [
    {"name": "A", "id": 2, "value": 279},
    {"name": "A", "id": 3, "value": 463},
    {"name": "B", "id": 8, "value": 508},
    {"name": "A", "id": 2, "value": 647},
    {"name": "A", "id": 2, "value": 969},
    {"name": "C", "id": 5, "value": 384},
]

out = {}
for d in sorted(my_list, key=lambda k: k["value"]):
    out[(d["name"], d["id"])] = d

print(list(out.values()))

Prints (reformatted here for readability):

[
    {"name": "A", "id": 2, "value": 969},
    {"name": "C", "id": 5, "value": 384},
    {"name": "A", "id": 3, "value": 463},
    {"name": "B", "id": 8, "value": 508},
]
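If sorting the whole list is not wanted, the same dictionary-keyed idea can be done in a single pass. The following is a minimal sketch, not part of the original answer: the function name `keep_highest` is made up for illustration, and it assumes every dictionary has "name", "id" and "value" keys with comparable values.

```python
def keep_highest(records):
    """Return one dict per (name, id) pair, keeping the one with the highest value."""
    best = {}
    for d in records:
        key = (d["name"], d["id"])
        # Store the entry if the key is new, or if it beats the stored value.
        if key not in best or d["value"] > best[key]["value"]:
            best[key] = d
    return list(best.values())


my_list = [
    {"name": "A", "id": 2, "value": 279},
    {"name": "A", "id": 3, "value": 463},
    {"name": "B", "id": 8, "value": 508},
    {"name": "A", "id": 2, "value": 647},
    {"name": "A", "id": 2, "value": 969},
    {"name": "C", "id": 5, "value": 384},
]

result = keep_highest(my_list)
print(result)
```

Because dictionaries preserve insertion order (Python 3.7+), the result is ordered by the first appearance of each (name, id) pair. On ties, the first entry seen is kept.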
Andrej Kesely
    I knew this question would get a load of answers immediately! I definitely prefer this solution though: no extra imports, short, and easily understandable. – Robson Aug 14 '21 at 12:35
2
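import itertools

# Sort so that entries with the same (name, id) are adjacent,
# with the highest value first within each group (sorts my_list in place).
my_list.sort(key=lambda d: (d["name"], d["id"], -d["value"]))

# The first entry of each group is the one with the highest value.
for _key, group in itertools.groupby(
    my_list,
    key=lambda d: (d["name"], d["id"]),
):
    print(next(group))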
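A variation on the groupby approach that collects the results into a list instead of printing them, and leaves the input list unsorted. This is only a sketch: the helper names `pair_key` and `dedupe_with_groupby` are hypothetical, and negating "value" in the sort key assumes the values are numeric.

```python
from itertools import groupby


def pair_key(d):
    """Grouping key: the (name, id) pair of a record."""
    return (d["name"], d["id"])


def dedupe_with_groupby(records):
    # sorted() returns a new list, so the caller's list is not reordered.
    # Within each (name, id) group the highest value comes first.
    ordered = sorted(records, key=lambda d: (d["name"], d["id"], -d["value"]))
    # groupby only merges adjacent items, which is why the sort is required.
    return [next(group) for _key, group in groupby(ordered, key=pair_key)]


my_list = [
    {"name": "A", "id": 2, "value": 279},
    {"name": "A", "id": 3, "value": 463},
    {"name": "B", "id": 8, "value": 508},
    {"name": "A", "id": 2, "value": 647},
    {"name": "A", "id": 2, "value": 969},
    {"name": "C", "id": 5, "value": 384},
]

result = dedupe_with_groupby(my_list)
print(result)
```

The result comes back sorted by name and then id, not in the original list order.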
Alex Hall
2

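Group the entries by (name, id) with a defaultdict, then take the entry with the highest value from each group: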

from collections import defaultdict

my_list = [
    {"name": "A", "id": 2, "value": 279},
    {"name": "A", "id": 3, "value": 463},
    {"name": "B", "id": 8, "value": 508},
    {"name": "A", "id": 2, "value": 647},
    {"name": "A", "id": 2, "value": 969},
    {"name": "C", "id": 5, "value": 384}]

data = defaultdict(list)
for entry in my_list:
    data[entry['name'], entry["id"]].append(entry)
new_data = []
for k, v in data.items():
    new_data.append(max(v, key=lambda x: x['value']))
print(new_data)

Output:

[{'name': 'A', 'id': 2, 'value': 969}, {'name': 'A', 'id': 3, 'value': 463}, {'name': 'B', 'id': 8, 'value': 508}, {'name': 'C', 'id': 5, 'value': 384}]
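The expected output in the question keeps each surviving dictionary at its original position in the list. If that ordering matters, one possible sketch is to find the best entry per (name, id) pair first and then filter the original list. The function name `keep_highest_in_original_order` is hypothetical, and the identity check assumes the same dictionary object does not appear in the list twice.

```python
def keep_highest_in_original_order(records):
    best = {}
    for d in records:
        key = (d["name"], d["id"])
        # Remember the entry with the highest value for each (name, id) pair.
        if key not in best or d["value"] > best[key]["value"]:
            best[key] = d
    # Keep only the winning objects, in the order they appear in the input.
    return [d for d in records if best[(d["name"], d["id"])] is d]


my_list = [
    {"name": "A", "id": 2, "value": 279},
    {"name": "A", "id": 3, "value": 463},
    {"name": "B", "id": 8, "value": 508},
    {"name": "A", "id": 2, "value": 647},
    {"name": "A", "id": 2, "value": 969},
    {"name": "C", "id": 5, "value": 384},
]

result = keep_highest_in_original_order(my_list)
print(result)
```

For the sample data this yields the entries in the same order as the list the question asks for: (A, 3), (B, 8), (A, 2) with value 969, then (C, 5).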
balderman
1

If you don't mind using pandas (a little overkill) you can create a DataFrame, sort by value and then drop duplicates on just name and id.

import pandas as pd

df = pd.DataFrame(my_list)
out_list = (
    df.sort_values("value", ascending=False)
    .drop_duplicates(["name", "id"], keep="first")
    .to_dict(orient="records")
)

Which outputs:

[{'name': 'A', 'id': 2, 'value': 969},
 {'name': 'B', 'id': 8, 'value': 508},
 {'name': 'A', 'id': 3, 'value': 463},
 {'name': 'C', 'id': 5, 'value': 384}]
Alex
0

First sort the list so that entries with the same (name, id) are adjacent, then take the element with the maximum value from each group:

from itertools import groupby
from operator import itemgetter
my_list = [
{"name" : "A","id" : 2,"value" : 279},
{"name" : "A","id" : 3,"value" : 463},
{"name" : "B","id" : 8,"value" : 508},
{"name" : "A","id" : 2,"value" : 647},
{"name" : "A","id" : 2,"value" : 969},
{"name" : "C","id" : 5,"value" : 384}]
# sort by name first, then by id, so that equal (name, id) pairs are adjacent
my_list.sort(key=itemgetter('name', 'id'))

# from each (name, id) group, take the element with the largest value
result = [max(g, key=itemgetter('value')) for _, g in groupby(my_list, itemgetter('name', 'id'))]
print(result)
eroot163pi