0

I have a pandas dataframe with dates as index and several criterions (customer group, quantity, ...). Dates can occur multiple times as there can be different sells every day. An example could be:

date --- quantity --- group

2018-03-22 --- 1 --- 100

2018-03-24 --- 10 --- 102

2018-03-25 --- 5 --- 102

2018-03-25 --- 2 --- 100

2018-03-25 --- 3 --- 102

What would be an elegant solution to sum up all elements with a specific date (index) that are in one group. So I'd like to have a dataframe that has only the unique dates as indices and columns that are the group numbers and the elements are the sum over the different quantities, so e.g. for 2018-03-25 it is 8 for group 102 and 2 for group 100.

Any help is appreciated. Thanks, Franz.

  • I think need `df = df.groupby(['date','group'])['quantity'].sum().reset_index()` – jezrael Mar 25 '18 at 13:13
  • Thanks a lot. That answers my question perfectly. How would you then create a dataframe which columns are the different groups, index is dates and entries are the calculated sums? – Newbie2014 Mar 25 '18 at 16:12
  • If solution from comment above not working, use `df = df.reset_index().groupby(['date','group'])['quantity'].sum().reset_index()` – jezrael Mar 25 '18 at 16:13
  • Thanks again for your quick answer. Your solution from above is working. I was wondering how to get a dataframe looking like: date --- group100 --- group102 \n 2018-03-24 --- 0 --- 10 \n 2018-03-25 ---2 --- 8 – Newbie2014 Mar 25 '18 at 16:25
  • I think need pivot - https://stackoverflow.com/questions/47152691/how-to-pivot-a-dataframe – jezrael Mar 25 '18 at 16:26
  • 1
    Thanks again for you help. – Newbie2014 Mar 25 '18 at 16:30

0 Answers0