0

I have a Pandas dataframe like this:

df = pd.DataFrame({
    'Date': ['2018-04-01', '2018-05-01', '2018-06-01', '2018-07-01', '2018-08-01'],
    'Product': ['a', 'a', 'a', 'b', 'b'],
    'Volumes': [10,30,40,50,60]})

Date       Product   Volumes
2018-04-01  a        10
2018-05-01  a        30
2018-06-01  a        40
2018-07-01  b        50
2018-08-01  b        60

I want to create a column which sums all the volumes of all preceding months for the same product only.

The desired output would be:


df = pd.DataFrame({
    'Date': ['2018-04-01', '2018-05-01', '2018-06-01', '2018-07-01', '2018-08-01'],
    'Product': ['a', 'a', 'a', 'b', 'b'],
    'Volumes': [10,30,40,50,60],
    'Result': [10,40,80,50,110] })


Date        Product   Volumes   Result
2018-04-01   a         10       10
2018-05-01   a         30       40
2018-06-01   a         40       80
2018-07-01   b         50       50
2018-08-01   b         60       110

Does anyone have any advice on how to do this?

Thanks!

Amy D
  • 55
  • 1
  • 6

1 Answers1

1

Use:

df['Result']=df.groupby('Product')['Volumes'].cumsum()

         Date Product  Volumes  Result
0  2018-04-01       a       10      10
1  2018-05-01       a       30      40
2  2018-06-01       a       40      80
3  2018-07-01       b       50      50
4  2018-08-01       b       60     110
anky
  • 74,114
  • 11
  • 41
  • 70