1
Input data:
array([['00:00:00', 20, 15.27],
       ['00:15:00', 20, 9.07],
       ['00:30:00', 20, 7.33],
       ...,
       ['00:30:00', 407, 34.0],
       ['00:00:00', 407, 172.0],
       ['00:10:00', 407, 187.0]], dtype=object)

The first column is the time, the second is the id, and the third is the price.

The array has 60k+ rows.

I need to find the sum of price per id for each time.

I am trying to do this without the GROUPBY function.

How can I achieve this? This is what I have tried so far:

result={}
for t,id,price in trial.inputs():
    result[t]={}
    if id not in result[t]:
        result[t][id]=0
    result[t][id]+=price
print (result)
Henry Ecker
v_space
  • Something like [Is there any numpy group by function?](https://stackoverflow.com/q/38013778/15497888) – Henry Ecker Jul 24 '21 at 17:48
  • It is similar, but my problem requires grouping into 10-minute intervals and then summing the prices per ID within each interval. – v_space Jul 24 '21 at 17:55
  • And why are we avoiding `pandas` functions? – Henry Ecker Jul 24 '21 at 17:55
  • It's an assessment to not use direct libraries. I'm trying to loop it through. I'm able to group the prices in the ids using the for loop, but unable to connect the time to it. – v_space Jul 24 '21 at 18:03
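
The 10-minute grouping mentioned in the comments can be sketched with plain string handling. This is only a sketch under assumptions: the times are `HH:MM:SS` strings, and each time is assigned to the interval that starts at the previous multiple of 10 minutes. The helper name `floor_to_10_minutes` is hypothetical and does not come from the question.

```python
def floor_to_10_minutes(t):
    """Map an 'HH:MM:SS' string to the start of its 10-minute interval.

    Assumption: intervals start at :00, :10, :20, ... of every hour.
    """
    hours, minutes, _seconds = t.split(":")
    bucket_minute = (int(minutes) // 10) * 10  # round the minutes down
    return f"{int(hours):02d}:{bucket_minute:02d}:00"


print(floor_to_10_minutes("00:15:00"))  # 00:10:00
print(floor_to_10_minutes("00:30:00"))  # 00:30:00
```

Applying such a helper to the time column before grouping would make every row in the same interval share one key.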

2 Answers

0

Update:

from collections import defaultdict

# Collect every price under its (t, id) key.
d = defaultdict(list)
for t, id, price in trial:
    d[t, id].append(price)
print(d)

I am able to group the prices based on t and id. How do I find the sum of the prices for each id?

v_space
0

Extending your answer, you can iterate over the dictionary one more time and sum each value (which in this case is a list of prices):

import numpy as np
from collections import defaultdict

data_array = np.array([['00:00:00', 20, 15.27],
                       ['00:15:00', 20, 9.07],
                       ['00:30:00', 20, 7.33],
                       ['00:30:00', 407, 34.0],
                       ['00:00:00', 407, 172.0],
                       ['00:10:00', 407, 187.0]], dtype=object)

print(data_array)

# Your solution to store the price values to corresponding [t, id].
time_id_to_price = defaultdict(list)
for t, id, price in data_array:
    time_id_to_price[t, id].append(price)
print(time_id_to_price)

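# Build a new dictionary that maps each (t, id) key to the sum of its prices.
# Iterate over the dictionary, not data_array: a repeated (t, id) row would
# otherwise call sum() on a value that has already been replaced by a number.
time_id_to_price_sum = {}
for key, prices in time_id_to_price.items():
    time_id_to_price_sum[key] = sum(prices)
print(time_id_to_price_sum)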
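
The grouping and summing can also be done in a single pass by accumulating a running total instead of a list. The following is a minimal sketch, not the only way to do it. It assumes rows in the same (time, id, price) layout; the names `rows` and `totals` are illustrative, and the row with price 4.73 is invented so that one key repeats.

```python
from collections import defaultdict

# Illustrative rows in (time, id, price) layout.
# The 4.73 row is made up so that the key ('00:00:00', 20) appears twice.
rows = [
    ['00:00:00', 20, 15.27],
    ['00:00:00', 20, 4.73],
    ['00:15:00', 20, 9.07],
    ['00:00:00', 407, 172.0],
]

# totals[t][id_] holds the running sum of prices for that time and id.
# Missing entries start at 0.0 because of defaultdict(float).
totals = defaultdict(lambda: defaultdict(float))
for t, id_, price in rows:
    totals[t][id_] += price

# Print one line per time with its per-id sums.
for t, per_id in totals.items():
    print(t, dict(per_id))
```

Nesting the dictionaries by time first keeps the "per id for each time" shape the question asks for, and it avoids storing every price when only the totals are needed.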
Ibrahim Berber