-4

Assume I have this:

[
    {"name": "bob", "total": 1},
    {"name": "alice", "total": 5},
    {"name": "eve", "total": 2},
    {"name": "bob", "total": 3},
    {"name": "alice", "total": 2},
    {"name": "alice", "total": 2},
]

I want to transform this list into:

[
    {"name": "bob", "total": 4},
    {"name": "alice", "total": 9},
    {"name": "eve", "total": 2}
]

Currently, for each item in the first list, I walk through the whole second list to check whether the name already exists.

How can I achieve this with a lower time complexity?

Antoine
  • Could you show your current code, please? You can do this in O(n) by using a dict that maps key names to value totals. See [this answer](https://stackoverflow.com/a/8528952/6243352) – ggorlen Jun 09 '21 at 15:22
  • Great answer here: https://stackoverflow.com/questions/8653516/python-list-of-dictionaries-search – The shape Jun 09 '21 at 15:26

4 Answers

1

If you only have two pieces of information (name and total), I would suggest changing your schema a bit. Instead of a list of dictionaries, use a single dictionary where the keys are names and the values are totals:

>>> values = [
...     {"name": "bob", "total": 1},
...     {"name": "alice", "total": 5},
...     {"name": "eve", "total": 2},
...     {"name": "bob", "total": 3},
...     {"name": "alice", "total": 2},
...     {"name": "alice", "total": 2},
... ]
>>> from collections import defaultdict
>>> totals_by_name = defaultdict(int)
>>> for value in values:
...     totals_by_name[value["name"]] += value["total"]
... 
>>> totals_by_name
defaultdict(<class 'int'>, {'bob': 4, 'alice': 9, 'eve': 2})

This can work even if you have more pieces of data that you want to look up by name (replace the integer value with a nested dictionary that stores the total as well as other data).
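As a minimal sketch of the nested-dictionary variant described above: each name maps to a small record instead of a bare integer. The extra "count" field and the names stats_by_name and result are assumptions made for illustration, not part of the original question.

```python
from collections import defaultdict

values = [
    {"name": "bob", "total": 1},
    {"name": "alice", "total": 5},
    {"name": "eve", "total": 2},
    {"name": "bob", "total": 3},
    {"name": "alice", "total": 2},
    {"name": "alice", "total": 2},
]

# Each name maps to a record; "count" is a hypothetical extra field
# tracked alongside the running total.
stats_by_name = defaultdict(lambda: {"total": 0, "count": 0})
for value in values:
    stats = stats_by_name[value["name"]]
    stats["total"] += value["total"]
    stats["count"] += 1

# Convert back to the list-of-dicts shape from the question if needed.
result = [
    {"name": name, "total": stats["total"]}
    for name, stats in stats_by_name.items()
]
print(result)
```

The whole thing is still a single pass over the input, so it stays O(n), and dicts keep insertion order, so the names come out in order of first appearance.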

Andrew Eckart
1
from collections import defaultdict

a = [
    {"name": "bob", "total": 1},
    {"name": "alice", "total": 5},
    {"name": "eve", "total": 2},
    {"name": "bob", "total": 3},
    {"name": "alice", "total": 2},
    {"name": "alice", "total": 2},
]

# sum the totals for each name
totals = defaultdict(int)
for d in a:
    totals[d['name']] += d['total']

# build the result list
a = [{'name': key, 'total': val} for key, val in totals.items()]
print(a)
Sudipto
1

You can use groupby from the itertools module:

from itertools import groupby
from operator import itemgetter

# itemgetter(foo) is roughly equivalent to lambda x: x[foo]
get_name = itemgetter('name')
get_total = itemgetter('total')

lst = [
    {"name": "bob", "total": 1},
    {"name": "alice", "total": 5},
    {"name": "eve", "total": 2},
    {"name": "bob", "total": 3},
    {"name": "alice", "total": 2},
    {"name": "alice", "total": 2},
]
grouped = groupby(sorted(lst, key=get_name), get_name)
new_list = [{'name': k, 'total': sum(get_total(x) for x in v)} for k, v in grouped]

groupby collects consecutive dicts that share the same value for the 'name' key into groups, which is why the list is sorted by name first (the sort makes this approach O(n log n) rather than O(n)). Iterating over the groups lets you sum the total values in each one and build a new list of dicts.
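To see why the sort matters, here is a small sketch comparing the group keys that groupby yields with and without sorting. The names unsorted_keys and sorted_keys are introduced only for this illustration.

```python
from itertools import groupby
from operator import itemgetter

get_name = itemgetter("name")

lst = [
    {"name": "bob", "total": 1},
    {"name": "alice", "total": 5},
    {"name": "eve", "total": 2},
    {"name": "bob", "total": 3},
    {"name": "alice", "total": 2},
    {"name": "alice", "total": 2},
]

# Without sorting, groupby starts a new group every time the key changes,
# so the same name can show up in several groups.
unsorted_keys = [key for key, _ in groupby(lst, get_name)]

# After sorting by name, equal keys are adjacent and each name gets one group.
sorted_keys = [key for key, _ in groupby(sorted(lst, key=get_name), get_name)]

print(unsorted_keys)  # ['bob', 'alice', 'eve', 'bob', 'alice']
print(sorted_keys)    # ['alice', 'bob', 'eve']
```

Only the last two "alice" entries merge in the unsorted case, because they are the only equal keys that sit next to each other.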

chepner
0

Let's say,

your_data = [
    {"name": "bob", "total": 1},
    {"name": "alice", "total": 5},
    {"name": "eve", "total": 2},
    {"name": "bob", "total": 3},
    {"name": "alice", "total": 2},
    {"name": "alice", "total": 2},
]

You can simply use pandas to get the desired output.

import pandas as pd

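df = pd.DataFrame(your_data)
# select the 'total' column explicitly; the first positional argument of sum() is not a column name
df = df.groupby(by = 'name', as_index = False)['total'].sum()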
result = df.to_dict(orient = 'records')

OUTPUT: [{'name': 'alice', 'total': 9}, {'name': 'bob', 'total': 4}, {'name': 'eve', 'total': 2}]

helloWORLD