
I have a dataframe with 2 columns. The objective is simple: reset the df.cumsum() whenever a row's condition column is False (0):

df

      value      condition
0       1            1
1       2            1
2       3            1
3       4            0
4       5            1

The wanted result is as follows:

df

      value      condition
0       1            1
1       3            1
2       6            1
3       4            0
4       9            1

If I loop over the dataframe as described in the post "Python pandas cumsum() reset after hitting max", I can achieve the wanted result, but I am looking for a more vectorized way using standard pandas functions.
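For reference, the loop-based approach looks roughly like the sketch below. It is not the code from the linked post; the names `running` and `totals` and the output column `cSum` are just placeholders chosen for illustration.

```python
import pandas as pd

df = pd.DataFrame({"value": [1, 2, 3, 4, 5], "condition": [1, 1, 1, 0, 1]})

running = 0   # running total since the last reset (illustrative name)
totals = []   # one running total per row (illustrative name)
for value, condition in zip(df["value"], df["condition"]):
    if condition == 0:
        running = 0          # a 0 in `condition` restarts the sum at this row
    running += value
    totals.append(running)

df["cSum"] = totals          # hypothetical output column
print(df)
```

This gives the wanted values, but it walks the rows one at a time in Python, which is what I would like to avoid.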

ReKx
The Other Guy

1 Answer


How about:

df['cSum'] = df.groupby((df.condition == 0).cumsum()).value.cumsum()

Output:

   value  condition  cSum
0      1          1     1
1      2          1     3
2      3          1     6
3      4          0     4
4      5          1     9

You'll group consecutive rows together until you encounter a 0 in the condition column, and then you apply the cumsum within each group separately.
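To make the grouping step easier to follow, here is the same idea split into two steps, as a small sketch on the sample data from the question. The intermediate name `group_id` is only for illustration.

```python
import pandas as pd

df = pd.DataFrame({"value": [1, 2, 3, 4, 5], "condition": [1, 1, 1, 0, 1]})

# Every 0 in `condition` increments the running count, so each 0 opens a new group.
group_id = (df["condition"] == 0).cumsum()   # illustrative name

# Cumulative sum of `value`, restarted at the first row of each group.
df["cSum"] = df["value"].groupby(group_id).cumsum()
print(df)
```

On this data `group_id` is 0, 0, 0, 1, 1, so the row where condition is 0 begins the second group and its own value is the first term of the new sum.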

ALollz
  • This is the exact problem I am trying to solve; however, the accepted solution raises an error in the current version of pandas. Do you know how to fix that? – UGuntupalli Jun 08 '21 at 01:55