3

I want to combine two lists of dicts into a new list of dicts: a dict that appears in only one list is appended to the result, and a dict that appears in both has its 'views' values added together.

a = [{'title': 'Learning How to Program', 'views': 1,'url': '/4XvR', 'slug': 'learning-how-to-program'},
     {'title': 'Mastering Programming', 'views': 3,'url': '/7XqR', 'slug': 'mastering-programming'}]

b = [{'title': 'Learning How to Program', 'views': 7,'url': '/4XvR', 'slug': 'learning-how-to-program'},
     {'title': 'Mastering Programming', 'views': 2,'url': '/7XqR', 'slug': 'mastering-programming'},
     {'title': 'Programming Fundamentals', 'views': 1,'url': '/93hB', 'slug': 'programming-fundamentals'}]

And the desired output would be:

c = [{'title': 'Learning How to Program', 'views': 8,'url': '/4XvR', 'slug': 'learning-how-to-program'},
     {'title': 'Mastering Programming', 'views': 5,'url': '/7XqR', 'slug': 'mastering-programming'},
     {'title': 'Programming Fundamentals', 'views': 1,'url': '/93hB', 'slug': 'programming-fundamentals'}]

I found "Is there any pythonic way to combine two dicts (adding values for keys that appear in both)?", but I do not understand how to get the desired output in my situation, where I have two lists of multiple dicts.

bhux

7 Answers

2

You need to convert your input dictionaries to (title: count) pairs, using them as keys and values in a Counter; then after summing, you can convert these back to your old format:

from collections import Counter

summed = sum((Counter({elem['title']: elem['views']}) for elem in a + b), Counter())
c = [{'title': title, 'views': counts} for title, counts in summed.items()]

Demo:

>>> from collections import Counter
>>> a = [{'title': 'Learning How to Program', 'views': 1},
...      {'title': 'Mastering Programming', 'views': 3}]
>>> b = [{'title': 'Learning How to Program', 'views': 7},
...      {'title': 'Mastering Programming', 'views': 2},
...      {'title': 'Programming Fundamentals', 'views': 1}]
>>> summed = sum((Counter({elem['title']: elem['views']}) for elem in a + b), Counter())
>>> summed
Counter({'Learning How to Program': 8, 'Mastering Programming': 5, 'Programming Fundamentals': 1})
>>> [{'title': title, 'views': counts} for title, counts in summed.items()]
[{'views': 8, 'title': 'Learning How to Program'}, {'views': 5, 'title': 'Mastering Programming'}, {'views': 1, 'title': 'Programming Fundamentals'}]

The goal here is to have a unique identifier per count. If your dictionaries are more complex, you either need to convert the whole dictionary (minus the count) to a unique identifier, or pick one of the values from the dictionary to be that identifier. Then sum the view counts per identifier.
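As a minimal sketch of the first option (turning the whole dictionary, minus the count, into the identifier), assuming every value other than 'views' is hashable. The helper name `identity_key` and the shortened sample data are made up for this illustration:

```python
def identity_key(entry, count_field='views'):
    # Hypothetical helper: everything except the count, as a hashable tuple.
    # Sorting makes the key independent of the order the keys were inserted in.
    return tuple(sorted((k, v) for k, v in entry.items() if k != count_field))

a = [{'title': 'Learning How to Program', 'views': 1, 'url': '/4XvR'},
     {'title': 'Mastering Programming', 'views': 3, 'url': '/7XqR'}]
b = [{'title': 'Learning How to Program', 'views': 7, 'url': '/4XvR'},
     {'title': 'Programming Fundamentals', 'views': 1, 'url': '/93hB'}]

totals = {}
for entry in a + b:
    key = identity_key(entry)
    totals[key] = totals.get(key, 0) + entry['views']

# Rebuild full dictionaries from each key plus its summed count.
c = [dict(key, views=views) for key, views in totals.items()]
```

Two entries are only merged here when all of their other fields match exactly, which is stricter than keying on a single field.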

From your updated example, the URL would be a good identifier. That'd let you collect the view count in place:

per_url = {}
for entry in a + b:
    key = entry['url']
    if key not in per_url:
        per_url[key] = entry.copy()
    else:
        per_url[key]['views'] += entry['views']

c = per_url.values()  # use list(per_url.values()) on Python 3

This simply uses the dictionaries themselves (or at least a copy of the first one encountered) to sum the view counts:

>>> from pprint import pprint
>>> a = [{'title': 'Learning How to Program', 'views': 1,'url': '/4XvR', 'slug': 'learning-how-to-program'},
...      {'title': 'Mastering Programming', 'views': 3,'url': '/7XqR', 'slug': 'mastering-programming'}]
>>> b = [{'title': 'Learning How to Program', 'views': 7,'url': '/4XvR', 'slug': 'learning-how-to-program'},
...      {'title': 'Mastering Programming', 'views': 2,'url': '/7XqR', 'slug': 'mastering-programming'},
...      {'title': 'Programming Fundamentals', 'views': 1,'url': '/93hB', 'slug': 'programming-fundamentals'}]
>>> per_url = {}
>>> for entry in a + b:
...     key = entry['url']
...     if key not in per_url:
...         per_url[key] = entry.copy()
...     else:
...         per_url[key]['views'] += entry['views']
... 
>>> per_url
{'/93hB': {'url': '/93hB', 'title': 'Programming Fundamentals', 'slug': 'programming-fundamentals', 'views': 1}, '/4XvR': {'url': '/4XvR', 'title': 'Learning How to Program', 'slug': 'learning-how-to-program', 'views': 8}, '/7XqR': {'url': '/7XqR', 'title': 'Mastering Programming', 'slug': 'mastering-programming', 'views': 5}}
>>> pprint(per_url.values())
[{'slug': 'programming-fundamentals',
  'title': 'Programming Fundamentals',
  'url': '/93hB',
  'views': 1},
 {'slug': 'learning-how-to-program',
  'title': 'Learning How to Program',
  'url': '/4XvR',
  'views': 8},
 {'slug': 'mastering-programming',
  'title': 'Mastering Programming',
  'url': '/7XqR',
  'views': 5}]
Martijn Pieters
  • My sincere apologies - I did not include all the data I have in the dicts - your solution sounds very close, but how can I deal with multiple key/value combinations in a single dict as now indicated in my question? – bhux May 12 '15 at 17:31
  • @bhux make a tuple key out of the remaining values; a namedtuple class could make it clearer. I'll see about an update when home. – Martijn Pieters May 12 '15 at 18:04
  • Ok thank you - I'm a little lost on how to accomplish what you are speaking to. I would humbly appreciate an example when you are home :) – bhux May 12 '15 at 18:16
  • Wow, except for the variable name, we came up with *identical* solutions. A good day for The Zen :-) – Stefan Pochmann May 13 '15 at 00:33
  • Thank you very much for your help on this Martijn! – bhux May 13 '15 at 07:59
  • @StefanPochmann: So we did! Sorry if it looks like I did a C&P job there, this really was a case of converging solutions. :-/ – Martijn Pieters May 13 '15 at 08:50
  • @MartijnPieters No worries, I did think we came up with it independently and I took it as a sign of me doing things right :-) – Stefan Pochmann May 13 '15 at 13:31
1

First, you need to convert your inputs into dicts, for example

b = {'Learning How to Program': 7,
     'Mastering Programming': 2,
     'Programming Fundamentals': 1}

After that, apply the solution you found, then convert it back to list of dicts.
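A rough sketch of those three steps, assuming the solution meant is the `Counter` addition approach and that only 'title' and 'views' need to survive the round trip. The helper name `to_counter` is invented for this example:

```python
from collections import Counter

a = [{'title': 'Learning How to Program', 'views': 1},
     {'title': 'Mastering Programming', 'views': 3}]
b = [{'title': 'Learning How to Program', 'views': 7},
     {'title': 'Mastering Programming', 'views': 2},
     {'title': 'Programming Fundamentals', 'views': 1}]

def to_counter(entries):
    # Hypothetical helper: collapse a list of dicts into {title: views}.
    counts = Counter()
    for entry in entries:
        counts[entry['title']] += entry['views']
    return counts

# Counter addition sums the values of keys present in both operands.
summed = to_counter(a) + to_counter(b)

# Convert back to a list of dicts.
c = [{'title': title, 'views': views} for title, views in summed.items()]
```

Note that `Counter` addition drops entries whose total is zero or negative, so this only suits counts that stay positive.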

Tuan Anh Hoang-Vu
1

Here's a simple one. Walks over all entries, copies an entry the first time it's encountered, and adds the views in subsequent encounters:

summary = {}    
for entry in a + b:
    key = entry['url']
    if key not in summary:
        summary[key] = entry.copy()
    else:
        summary[key]['views'] += entry['views']
c = list(summary.values())
Stefan Pochmann
0

This might not be the most pythonic solution:

def coalesce(d1, d2):
    # Copy each dict so the input lists are not modified.
    combined = [dict(i) for i in d1]
    for d in d2:
        found = False
        for itr in combined:
            if itr['title'] == d['title']:
                itr['views'] += d['views']
                found = True
                break
        if not found:
            combined.append(dict(d))
    return combined
Farmer Joe
0

Non-optimal, but works:

>>> from collections import Counter
>>> from pprint import pprint
>>> a = [{'title': 'Learning How to Program', 'views': 1,'url': '/4XvR', 'slug': 'learning-how-to-program'},
...      {'title': 'Mastering Programming', 'views': 3,'url': '/7XqR', 'slug': 'mastering-programming'}]
>>> b = [{'title': 'Learning How to Program', 'views': 7,'url': '/4XvR', 'slug': 'learning-how-to-program'},
...      {'title': 'Mastering Programming', 'views': 2,'url': '/7XqR', 'slug': 'mastering-programming'},
...      {'title': 'Programming Fundamentals', 'views': 1,'url': '/93hB', 'slug': 'programming-fundamentals'}]
>>> summed = sum((Counter({x['slug']: x['views']}) for x in a+b), Counter())
>>> c = dict()
>>> _ = [c.update({x['slug']: x}) for x in a + b]
>>> _ = [c[x].update({'views': summed[x]}) for x in c.keys()]
>>> pprint(c.values())
[{'slug': 'mastering-programming',
  'title': 'Mastering Programming',
  'url': '/7XqR',
  'views': 5},
 {'slug': 'programming-fundamentals',
  'title': 'Programming Fundamentals',
  'url': '/93hB',
  'views': 1},
 {'slug': 'learning-how-to-program',
  'title': 'Learning How to Program',
  'url': '/4XvR',
  'views': 8}]

Based on the Counter idea from Martijn with some more iterations to update the counter values with the other attributes, assuming they don't change.

Note that the comprehensions hide a few loops that are run only for their side effects.
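The same steps can be spelled out with ordinary for loops. This is only a sketch with shortened sample data; unlike the session above, it stores a copy of each dict so that `a` and `b` are left untouched, which is a change made for this illustration:

```python
from collections import Counter

a = [{'title': 'Learning How to Program', 'views': 1, 'slug': 'learning-how-to-program'},
     {'title': 'Mastering Programming', 'views': 3, 'slug': 'mastering-programming'}]
b = [{'title': 'Learning How to Program', 'views': 7, 'slug': 'learning-how-to-program'},
     {'title': 'Mastering Programming', 'views': 2, 'slug': 'mastering-programming'},
     {'title': 'Programming Fundamentals', 'views': 1, 'slug': 'programming-fundamentals'}]

summed = Counter()
c = {}
for x in a + b:
    summed[x['slug']] += x['views']   # running total per slug
    c[x['slug']] = dict(x)            # last entry seen wins; copied, not shared

# Overwrite each stored view count with the summed total.
for slug in c:
    c[slug]['views'] = summed[slug]

result = list(c.values())
```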

Ronoaldo Pereira
0

A simple function that does what you need for any given number of lists:

import itertools
from collections import Counter, OrderedDict

def sum_views(*lists):
    views = Counter()
    docs = OrderedDict()  # to preserve input order
    for doc in itertools.chain(*lists):
        slug = doc['slug']
        views[slug] += doc['views']
        docs[slug] = dict(doc)   # shallow copy of original dict
        docs[slug]['views'] = views[slug]
    return list(docs.values())  # a real list on both Python 2 and 3
sirfz
0

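Assuming that you don't want to hard-code the key names "title" and "views", a more general way is to write it like this:

def combing(x):
    result = {}
    for i in x:
        # Relies on each dict listing the identifier first and the count second
        # (insertion order is only guaranteed from Python 3.7 on).
        h = list(i.values())  # list() is needed on Python 3, where values() is a view
        result[h[0]] = result.get(h[0], 0) + h[1]
    return result

combing([{'item': 'item1', 'amount': 400}, {'item': 'item2', 'amount': 300},
         {'item': 'item1', 'amount': 750}])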
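If relying on the order of the values feels fragile, one possible variant passes the key names in explicitly. The function name `combine_by` and its parameters are assumptions made for this sketch, not part of the answer above:

```python
def combine_by(entries, key_field, count_field):
    # Hypothetical helper: sum count_field for each distinct key_field value.
    result = {}
    for entry in entries:
        key = entry[key_field]
        result[key] = result.get(key, 0) + entry[count_field]
    return result

totals = combine_by(
    [{'item': 'item1', 'amount': 400},
     {'item': 'item2', 'amount': 300},
     {'item': 'item1', 'amount': 750}],
    key_field='item',
    count_field='amount',
)
print(totals)  # {'item1': 1150, 'item2': 300}
```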
Mike