2

I searched for some time, but couldn't find an exact solution to my problem. I have a list of dictionaries in this format:

d = [{"sender": "a", "time": 123, "receiver": "b", "amount": 2}, {"sender": "c", "time": 124, "receiver": "b", "amount": 10}, {"sender": "a", "time": 130, "receiver": "b", "amount": 5}]

I would like to find the best way to iterate over all the dictionaries and count how many times a given pair of sender-receiver occurs and the sum of the total amount.

So I would like to get:

result = [{"sender": "a", "receiver": "b", "count": 2, "total_amount": 7}, {"sender": "c", "receiver": "b", "count": 1, "total_amount": 10}]

I am pretty sure I can make this work by iterating over the dictionaries in the list one by one and saving the information in a temporary dictionary, but that will lead to a lot of nested if statements. I was hoping there is a cleaner way to do this.

I know I can use Counter to count the number of occurrences of a single value:

from collections import Counter
Counter(val["sender"] for val in d)

which will give me:

Counter({'a': 2, 'c': 1})

but how can I do this for a pair of values and have separate dictionaries for each?

Thank you in advance and I hope my question was clear enough

Koen G.
Georgi Nikolov

6 Answers

2

A pure Python way is to build a new hash table keyed by sender:receiver pairs.

Updated to also track the total amount, as requested.

d = [{"sender": "a", "time": 123, "receiver": "b", "amount": 2},
     {"sender": "c", "time": 124, "receiver": "b", "amount": 10},
     {"sender": "a", "time": 130, "receiver": "b", "amount": 5}]

nd = {}

for o in d:
    sender = o['sender']
    recv = o['receiver']
    amount = o['amount']

    # Key each pair as "sender:receiver"; the value is (count, total_amount).
    k = sender + ":" + recv
    if k not in nd:
        nd[k] = (0, 0)

    nd[k] = (nd[k][0] + 1, nd[k][1] + amount)

print(nd)

which results in {'a:b': (2, 7), 'c:b': (1, 10)}
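
One caveat with joining the names into a string key: if a name can itself contain ":", two different pairs may collapse into one key. Below is a minimal sketch of how a tuple key avoids that. The records are hypothetical and not taken from the question's data, and the "receiver" spelling follows the question.

```python
# Hypothetical records (not from the question) whose names contain ":".
records = [
    {"sender": "a:b", "receiver": "c", "amount": 1},
    {"sender": "a", "receiver": "b:c", "amount": 2},
]

string_keyed = {}
tuple_keyed = {}
for rec in records:
    s_key = rec["sender"] + ":" + rec["receiver"]  # both records give "a:b:c"
    t_key = (rec["sender"], rec["receiver"])       # stays two distinct keys
    # Apply the same (count, total_amount) update to both tables.
    for table, key in ((string_keyed, s_key), (tuple_keyed, t_key)):
        count, total = table.get(key, (0, 0))
        table[key] = (count + 1, total + rec["amount"])

print(string_keyed)
print(tuple_keyed)
```

With the string key the two records are merged into a single entry; with the tuple key they stay separate.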

ege
  • Thank you for the answer. As I mentioned under the answer of @Rakesh that one is the closest to what I want as i want to write the output to json and be able to display it using javascript on a website so I would like to have dictionary with all the keys present. But your answer is also very helpful if i need to do such types of operations again on dictionaries. – Georgi Nikolov Nov 24 '19 at 10:27
2

This is one approach using a simple iteration with dict methods.

Ex:

d = [{"sender": "a", "time": 123, "reciever": "b", "amount": 2}, {"sender": "c", "time": 124, "reciever": "b", "amount": 10}, {"sender": "a", "time": 130, "reciever": "b", "amount": 5}]
result = {}
for i in d:
    key = (i['sender'], i['reciever'])
    # del i['time']  # if you do not need `time` key
    if key not in result:
        i.update({'total_amount': i.pop('amount'), 'count': 1})
        result[key] = i
    else:
        result[key]['total_amount'] += i['amount']
        result[key]['count'] += 1

print(list(result.values()))

Output:

[{'count': 2, 'reciever': 'b', 'sender': 'a', 'time': 123, 'total_amount': 7},
 {'count': 1, 'reciever': 'b', 'sender': 'c', 'time': 124, 'total_amount': 10}]
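
Note that the loop above modifies the dictionaries in `d` in place and carries the first `time` value of each pair into the output. If that matters, here is a sketch of a variant that leaves the input untouched and emits only the four keys the question asked for. It assumes the "receiver" spelling from the question, and the names `entry` and `summary` are only illustrative.

```python
d = [
    {"sender": "a", "time": 123, "receiver": "b", "amount": 2},
    {"sender": "c", "time": 124, "receiver": "b", "amount": 10},
    {"sender": "a", "time": 130, "receiver": "b", "amount": 5},
]

result = {}
for rec in d:
    key = (rec["sender"], rec["receiver"])
    # setdefault returns the existing summary for this pair, or stores and
    # returns the fresh one built here, so the input records are never modified.
    entry = result.setdefault(
        key,
        {"sender": rec["sender"], "receiver": rec["receiver"],
         "count": 0, "total_amount": 0},
    )
    entry["count"] += 1
    entry["total_amount"] += rec["amount"]

summary = list(result.values())
print(summary)
```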
Rakesh
  • @Jan I did not use it because of multiple key-value modifications – Rakesh Nov 22 '19 at 13:58
  • Plus one for getting his actual expected result down – ege Nov 22 '19 at 14:00
  • as @ege said this is the result I want to get as I want to afterwards be able to write the output to a json file to be displayed using javascript on a website. I will count this as the real answer but ege 's answer is also very useful – Georgi Nikolov Nov 24 '19 at 10:26
1

You could use pandas to parse the list of dictionaries into a dataframe.
The dataframe would allow you to easily sum over the amount field for certain sender receiver pairs.

import pandas as pd

# Named `records` rather than `dict` to avoid shadowing the built-in.
records = [{"sender": "a", "time": 123, "receiver": "b", "amount": 2},
           {"sender": "c", "time": 124, "receiver": "b", "amount": 10},
           {"sender": "a", "time": 130, "receiver": "b", "amount": 5}]

df = pd.DataFrame.from_records(records)
group = df.groupby(by=['sender', 'receiver'])

result = group.sum()
result['occurrences'] = group.size()
print(result)

will output

                 time  amount  occurrences
sender receiver
a      b          253       7            2
c      b          124      10            1
Max Crous
1

Max Crous's answer is more elegant than this, but in case you'd like to avoid extra libraries: this is a pure python way:

import collections
result = collections.defaultdict(lambda: [0, 0])
for e in d:  # `d` is the list from the question
    key = (e['sender'], e['receiver'])
    result[key][0] += e['amount']  # running total
    result[key][1] += 1            # occurrence count

`result` is now a dictionary with (sender, receiver) tuples as keys and 2-element lists [total_amount, count] as values.
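
Since the asker mentions writing the output to JSON, it may be worth noting that tuple keys cannot be passed to `json.dumps` as-is. Here is a sketch of one way to flatten the tuple-keyed result into the list of dictionaries from the question before serialising. The names `rows` and `as_json` are illustrative.

```python
import collections
import json

d = [
    {"sender": "a", "time": 123, "receiver": "b", "amount": 2},
    {"sender": "c", "time": 124, "receiver": "b", "amount": 10},
    {"sender": "a", "time": 130, "receiver": "b", "amount": 5},
]

# Same aggregation as above: (sender, receiver) -> [total_amount, count].
result = collections.defaultdict(lambda: [0, 0])
for e in d:
    pair = (e["sender"], e["receiver"])
    result[pair][0] += e["amount"]
    result[pair][1] += 1

# Unpack each tuple key and value list into a flat, JSON-friendly dict.
rows = [
    {"sender": s, "receiver": r, "count": count, "total_amount": total}
    for (s, r), (total, count) in result.items()
]
as_json = json.dumps(rows)
print(as_json)
```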

Koen G.
0

Imo the easiest and cleanest solution would be to use a defaultdict:

from collections import defaultdict

dct = [{"sender": "a", "time": 123, "reciever": "b", "amount": 2},
       {"sender": "c", "time": 124, "reciever": "b", "amount": 10},
       {"sender": "a", "time": 130, "reciever": "b", "amount": 5}]

result = defaultdict(int)
for item in dct:
    key = "{}:{}".format(item["sender"], item["reciever"])
    result[key] += item["amount"]

print(result)

Which results in

defaultdict(<class 'int'>, {'a:b': 7, 'c:b': 10})

Besides, don't call your variables dict or list.
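
The snippet above sums the amounts but does not count occurrences. Since the question already uses `Counter`, one possible extension (a sketch, assuming the "receiver" spelling from the question) is to count tuples of (sender, receiver) and to use a second `Counter` as the accumulator for the amounts:

```python
from collections import Counter

dct = [
    {"sender": "a", "time": 123, "receiver": "b", "amount": 2},
    {"sender": "c", "time": 124, "receiver": "b", "amount": 10},
    {"sender": "a", "time": 130, "receiver": "b", "amount": 5},
]

# Counter accepts any hashable element, so a (sender, receiver) tuple works.
counts = Counter((item["sender"], item["receiver"]) for item in dct)

# Missing keys in a Counter read as 0, so += needs no initialisation.
totals = Counter()
for item in dct:
    totals[(item["sender"], item["receiver"])] += item["amount"]

print(counts)
print(totals)
```

The two counters share the same keys, so `counts[pair]` and `totals[pair]` can be combined into whatever output shape is needed.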

Jan
0

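Using a dictionary keyed by sender, you can store a count and a running amount for each sender and update both as you iterate. Note that this groups by sender only, and the "reciever" field holds the number of occurrences rather than a receiver name.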

records = [{"sender": "a", "time": 123, "reciever": "b", "amount": 2}, {"sender": "c", "time": 124, "reciever": "b", "amount": 10}, {"sender": "a", "time": 130, "reciever": "b", "amount": 5}]
dict1 = {}
for eachitem in records:
    sender = eachitem["sender"]
    if sender in dict1:
        dict1[sender]["amount"] += eachitem["amount"]
        dict1[sender]["reciever"] += 1  # this field holds the count
    else:
        dict1[sender] = {"reciever": 1, "amount": eachitem["amount"]}

print(dict1)

output

{'a': {'reciever': 2, 'amount': 7}, 'c': {'reciever': 1, 'amount': 10}}
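
Keying on the sender alone merges every receiver of that sender into one entry. To keep the sender-first layout while still separating pairs, one option is a second dictionary level keyed by receiver. The following is a sketch only: it assumes the "receiver" spelling from the question, and the names `nested` and `plain` are illustrative.

```python
from collections import defaultdict

records = [
    {"sender": "a", "time": 123, "receiver": "b", "amount": 2},
    {"sender": "c", "time": 124, "receiver": "b", "amount": 10},
    {"sender": "a", "time": 130, "receiver": "b", "amount": 5},
]

# Outer key: sender. Inner key: receiver. Leaf: count and summed amount.
nested = defaultdict(lambda: defaultdict(lambda: {"count": 0, "amount": 0}))
for rec in records:
    entry = nested[rec["sender"]][rec["receiver"]]
    entry["count"] += 1
    entry["amount"] += rec["amount"]

# Convert to plain dicts for printing or serialising.
plain = {sender: dict(inner) for sender, inner in nested.items()}
print(plain)
```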
SRG