Group a large list of dictionaries by key value in django

Question

I have a list of dictionaries as follows

[{'grade': '1', 'past_student_sum': 1611}, 
 {'grade': '2', 'past_student_sum': 1631}, 
 {'grade': '3', 'past_student_sum': 1598}, 
 {'grade': '1', 'current_student_sum': 1611}, 
 {'grade': '2', 'current_student_sum': 1631}, 
 {'grade': '3', 'current_student_sum': 1598}]

I got this list by combining 2 query sets in the following fashion:

grade_list = list(past_enrollments) + list(current_enrollments)

Is there a better way to combine these so that I get a list that looks like this:

[{'grade': '1', 'past_student_sum': 1611, 'current_student_sum': 1621},
 {'grade': '2', 'past_student_sum': 1511, 'current_student_sum': 1521}]

is it a coincidence that the `'past_student_sum'` and `'current_student_sum'` **always** have the same value for corresponding grades? — Ma0, Mar 01 '18 at 15:52
Yes. This is just dummy data. But in reality sometimes they will be the same, other times they will be different. — Hanny, Mar 01 '18 at 15:52

score 6 · Accepted Answer · answered Mar 01 '18 at 15:56

Instead of building a list of dictionaries from past_enrollments and current_enrollments, I would build another dictionary using the grade value as the key. The easiest way to do this is probably with a defaultdict:

from collections import defaultdict
from itertools import chain

grades = defaultdict(dict)

for d in chain(past_enrollments, current_enrollments):
    grades[d['grade']].update(d)

Then our finished dictionaries are just the values of that dictionary:

grades = list(grades.values())
print(grades)
# [{'grade': '1', 'past_student_sum': 1611, 'current_student_sum': 1611}, 
#  {'grade': '2', 'past_student_sum': 1631, 'current_student_sum': 1631}, 
#  {'grade': '3', 'past_student_sum': 1598, 'current_student_sum': 1598}]
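
For a version that runs on its own, here is a minimal sketch. The two lists stand in for the querysets (assuming each one yields dictionaries, for example via `.values()`); the numbers and the `merge_by_grade` name are made up for illustration and are not from the question.

```python
from collections import defaultdict
from itertools import chain

# Stand-ins for the two querysets; the values are made-up sample data.
past_enrollments = [
    {'grade': '1', 'past_student_sum': 1611},
    {'grade': '2', 'past_student_sum': 1631},
]
current_enrollments = [
    {'grade': '1', 'current_student_sum': 1621},
    {'grade': '3', 'current_student_sum': 1598},
]


def merge_by_grade(*iterables):
    """Merge dictionaries that share a 'grade' value into one dictionary per grade."""
    grades = defaultdict(dict)
    for d in chain(*iterables):
        # update() copies the keys of d into the dictionary kept for its grade
        grades[d['grade']].update(d)
    return list(grades.values())


merged = merge_by_grade(past_enrollments, current_enrollments)
print(merged)
```

A grade that appears in only one of the inputs keeps just the keys it has, so grade '2' ends up without a 'current_student_sum' here.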

answered Mar 01 '18 at 15:56

Patrick Haugh

what are `past_enrollments` and `current_enrollments` in your snippet? – Ma0 Mar 01 '18 at 16:01
@Ev.Kounis Whatever they are in the question? Some iterables containing dictionaries, presumably. – Patrick Haugh Mar 01 '18 at 16:01
Oh, I missed that. In that case, this is my favorite answer! Nice – Ma0 Mar 01 '18 at 16:02

Erik Cederstrand · Answer 2 · 2018-03-01T15:59:36.910

2

Here's one solution using a dict to group and merge the records by grade:

from collections import defaultdict

grade_map = defaultdict(dict)
for grade_info in grade_list:
    grade_map[grade_info['grade']].update(grade_info)
print(list(grade_map.values()))

edited Mar 01 '18 at 15:59

answered Mar 01 '18 at 15:52

Did you get the desired result when you tried running this? – Ma0 Mar 01 '18 at 15:52
Nah, I just realized this is not enough – Erik Cederstrand Mar 01 '18 at 15:53
Updated the example now – Erik Cederstrand Mar 01 '18 at 16:00

score 1 · Answer 3 · answered Mar 01 '18 at 15:51

This might help.

# -*- coding: utf-8 -*-

d = [{'grade': '1', 'past_student_sum': 1611},
 {'grade': '2', 'past_student_sum': 1631},
 {'grade': '3', 'past_student_sum': 1598},
 {'grade': '1', 'current_student_sum': 1611},
 {'grade': '2', 'current_student_sum': 1631},
 {'grade': '3', 'current_student_sum': 1598}]

e = {}
for i in d:
    if i["grade"] not in e:
        # Copy the record so the dictionaries in d are not modified
        e[i["grade"]] = dict(i)
    else:
        # Merge every key, so a sum of 0 or a late past_student_sum is not skipped
        e[i["grade"]].update(i)

print(list(e.values()))

Output (Python 3.7+, where dictionaries keep insertion order):

[{'grade': '1', 'past_student_sum': 1611, 'current_student_sum': 1611},
 {'grade': '2', 'past_student_sum': 1631, 'current_student_sum': 1631},
 {'grade': '3', 'past_student_sum': 1598, 'current_student_sum': 1598}]

Abdul Niyas P M · Answer 4 · 2018-03-01T16:06:54.243

This could help you.

your_list = [
             {'grade': '1', 'past_student_sum': 1611},
             {'grade': '2', 'past_student_sum': 1631},
             {'grade': '3', 'past_student_sum': 1598},
             {'grade': '1', 'current_student_sum': 1611},
             {'grade': '2', 'current_student_sum': 1631},
             {'grade': '3', 'current_student_sum': 1598}
             ]



from itertools import groupby

result = []
key_func = lambda x: x['grade']

for i, j in groupby(sorted(your_list, key=key_func), key=key_func):
    group = {}
    for k in j:
        group.update(k)
    result.append(group)

print(result)
# [{'grade': '1', 'current_student_sum': 1611, 'past_student_sum': 1611}, {'grade': '2', 'current_student_sum': 1631, 'past_student_sum': 1631}, {'grade': '3', 'current_student_sum': 1598, 'past_student_sum': 1598}]
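
The `sorted` call matters here because `groupby` only groups consecutive items with the same key. A small sketch of the difference, using made-up records and a hypothetical `grade_of` helper:

```python
from itertools import groupby

records = [
    {'grade': '1', 'past_student_sum': 1611},
    {'grade': '2', 'past_student_sum': 1631},
    {'grade': '1', 'current_student_sum': 1611},
]


def grade_of(record):
    return record['grade']


# Without sorting, the two grade '1' records are not adjacent,
# so groupby starts a new group each time the key changes.
unsorted_keys = [key for key, _ in groupby(records, key=grade_of)]

# After sorting by the same key, each grade forms exactly one group.
sorted_keys = [key for key, _ in groupby(sorted(records, key=grade_of), key=grade_of)]

print(unsorted_keys)  # ['1', '2', '1']
print(sorted_keys)    # ['1', '2']
```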

tommy.carstensen · Answer 5 · 2018-03-01T16:12:30.807

1

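I prefer the answer by Patrick myself. If you are allowed to use pandas, you can use groupby and to_dict, together with sum and reset_index. Note that the sums come back as floats, because the keys missing from each dictionary become NaN in the DataFrame.

import pandas as pd

# Sum the columns per grade, then convert each row back to a dictionary
merged = pd.DataFrame(grade_list).groupby('grade').sum().reset_index().to_dict('records')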

edited Mar 01 '18 at 16:12

answered Mar 01 '18 at 16:03

score -1 · Answer 6 · answered Mar 01 '18 at 16:33

# listone and listtwo are the two lists of dictionaries, in the same grade order
listthree = []
for item1, item2 in zip(listone, listtwo):
    # Merge each positional pair into one new dictionary
    listthree.append(dict(item1, **item2))

I'd recommend having a look at itertools though, as there may be a more efficient way of doing this.

Also this is assuming the lists are the same size.
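
If the lists might differ in length, one possible variation (a sketch, not part of the original approach) is `itertools.zip_longest` with an empty dictionary as the fill value. The sample lists are made up, and the pairing is still by position, so it only gives sensible results when both lists are in the same grade order and only the tail of one list is missing.

```python
from itertools import zip_longest

# Made-up sample lists of different lengths, both ordered by grade.
listone = [
    {'grade': '1', 'past_student_sum': 1611},
    {'grade': '2', 'past_student_sum': 1631},
]
listtwo = [
    {'grade': '1', 'current_student_sum': 1621},
]

# fillvalue={} pads the shorter list with empty dictionaries,
# so merging a missing entry adds nothing.
listthree = [
    {**item1, **item2}
    for item1, item2 in zip_longest(listone, listtwo, fillvalue={})
]
print(listthree)
```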

See the answer below for more on iterating

How to iterate through two lists in parallel?

https://stackoverflow.com/a/1663826/5990760
