Given a large dataset of one million records, I am looking for ways to do a group by. I am new to Python, but I know SQL has GROUP BY, and I am guessing something similar might be applicable here.
What I want to achieve is this:
From
["A", 4]
["B", 4]
["F", 3]
["A", 4]
["B", 1]
To
["A", (4,4)]
["B", (1,4)]
["F", (3)]
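A minimal sketch of the grouping step shown above, assuming Python 3.7+ and that each record is a two-element [key, rating] list. The names `records` and `group_ratings` are made up for illustration. It uses `collections.defaultdict` from the standard library and makes a single pass over the data, which should matter with a million records.

```python
from collections import defaultdict

# Sample records in the same [key, rating] shape as the example above.
records = [["A", 4], ["B", 4], ["F", 3], ["A", 4], ["B", 1]]


def group_ratings(rows):
    """Collect the ratings for each key in one pass over the rows."""
    groups = defaultdict(list)
    for key, rating in rows:
        groups[key].append(rating)
    # Sort each group's ratings so "B" comes out as (1, 4), as in the example.
    return [[key, tuple(sorted(ratings))] for key, ratings in groups.items()]


grouped = group_ratings(records)
print(grouped)
```

Keys appear in the order they were first seen, because dictionaries preserve insertion order in Python 3.7 and later. Note that a one-element tuple is written (3,) in Python, not (3).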
I am also looking for an efficient way to calculate the average of the list of ratings. So finally the output should be:
["A", 4]
["B", 2.5]
["F", 3]
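One possible way to get these averages without building the intermediate lists, sketched under the assumption of Python 3 (where `/` is true division) and two-element [key, rating] records. The names `records` and `average_ratings` are hypothetical. It keeps only a running sum and count per key, so memory use grows with the number of distinct keys rather than the number of records.

```python
from collections import defaultdict

# Sample records in the same [key, rating] shape as the example above.
records = [["A", 4], ["B", 4], ["F", 3], ["A", 4], ["B", 1]]


def average_ratings(rows):
    """Return [key, mean rating] pairs using running totals per key."""
    totals = defaultdict(int)  # sum of ratings seen so far for each key
    counts = defaultdict(int)  # number of ratings seen so far for each key
    for key, rating in rows:
        totals[key] += rating
        counts[key] += 1
    return [[key, totals[key] / counts[key]] for key in totals]


averages = average_ratings(records)
print(averages)
```

The averages come back as floats, so "A" is 4.0 rather than 4.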
I've tried an iterative approach, but the error thrown was "there was too much data to unpack". Here is my attempt, which does not work on the dataset:
len = max(key for (item, key) in results)
newList = [[] for i in range(len+1)]
for item, key in results:
    newList[key].append(item)
I am looking for an efficient way to do this. Is there a way to do a group by inside a list comprehension? Thanks!
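For reference, a group by can be written as a list comprehension with `itertools.groupby` from the standard library. The sketch below assumes two-element [key, rating] records; the names `records`, `by_key`, and `averages_by_groupby` are invented for illustration. `groupby` only merges adjacent rows with equal keys, so the data has to be sorted by key first, and that sort is likely to make this slower than a dictionary-based single pass on a million records.

```python
from itertools import groupby
from operator import itemgetter
from statistics import mean

# Sample records in the same [key, rating] shape as the example above.
records = [["A", 4], ["B", 4], ["F", 3], ["A", 4], ["B", 1]]

by_key = itemgetter(0)  # selects the grouping key (first element) of a record

# Sort by key so equal keys are adjacent, then average each group's ratings.
averages_by_groupby = [
    [key, mean(rating for _, rating in group)]
    for key, group in groupby(sorted(records, key=by_key), key=by_key)
]
print(averages_by_groupby)
```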