-5

I want to aggregate a Python Dictionary I have by date and by sales. For example, my dictionary looks like this:

{'Sales Data Year': 2022,
'rawData': [{
'id': 23490832405,
'sum': [100, 4324, 4564, 23, 77],
'startTime': '2022-01-01T00:00:00.000Z',
'endTime': '2022-01-01T08:00:00.000Z',
'indicator': 8,
'clients': 324},
{'id': 4239084235,
'sum': [321, 456, 234, 7865, 233],
'startTime': '2022-01-01T00:00:00.000Z',
'endTime': '2022-01-01T08:00:00.000Z',
'indicator': 8,
'clients': 543},
{'id': 235908221,
'sum': [100, 4324, 4564, 23, 77],
'startTime': '2022-01-01T08:00:00.000Z',
'endTime': '2022-01-01T16:00:00.000Z',
'indicator': 8,
'clients': 324},
{'id': 23534543663,
'sum': [654, 43, 564, 654, 823],
'startTime': '2022-01-01T16:00:00.000Z',
'endTime': '2022-01-02T00:00:00.000Z',
'indicator': 8,
'clients': 6453},
{'id': 429234802452,
'sum': [423, 4324, 4431, 23, 765],
'startTime': '2022-01-02T00:00:00.000Z',
'endTime': '2022-01-02T08:00:00.000Z',
'indicator': 8,
'clients': 867}]}

How can I aggregate everything by date (e.g., 2022-01-01, 2022-01-02, ...), not by time? For example, I want to be able to sum all the values in the sum Array, sum the clients field, and average the indicator field.

I've tried looping through the dictionary, but I am having a hard time dealing with combining the dates since they have time stamps attached.

juanpa.arrivillaga
  • 88,713
  • 10
  • 131
  • 172
325
  • 594
  • 8
  • 21
  • @esqew I've tried looping through the dictionary, but I am having a hard time dealing with combining the dates since they have time stamps attached. – 325 Feb 17 '22 at 17:04
  • Can you share the code of your attempt as a [mre] in the body of your question in accordance with [ask] to illustrate where *specifically* you're getting stuck? – esqew Feb 17 '22 at 17:08
  • Does this answer your question? [How do I convert a datetime to date?](https://stackoverflow.com/questions/3743222/how-do-i-convert-a-datetime-to-date) – esqew Feb 17 '22 at 17:11

1 Answers1

0

You could try this.

#Assign your data to a dictionary
test = {'Sales Data Year': 2022, 'rawData': [{ ....... your full data }

#your useful data is in key rawData 
df = pd.DataFrame(test['rawData'])
df['sum'] = df['sum'].apply(lambda x: sum(x))
df['startTime'] = df['startTime'].apply(pd.to_datetime)

#Group by date and print the sum
print(df.groupby([df['startTime'].dt.date]).sum())

Check this question to get other values like mean, std etc. Python pandas groupby aggregate on multiple columns, then pivot

Manjunath K Mayya
  • 1,078
  • 1
  • 11
  • 20