I'm using pandas, and `cumsum` is not behaving as I expected, so I was wondering if anyone could shed some light on how it works.
I have a dataframe that looks as follows, where `cum_sum` is the column I am trying to produce:
| |price |flag |cum_sum |
|-----|---------|------|---------|
|0 |2 |1 |2 |
|1 |5 |1 |7 |
|2 |8 |1 |15 |
|3 |9 |0 |0 |
|4 |12 |0 |0 |
|5 |2 |1 |17 |
Currently the code looks as follows:

    df['cum_sum'] = df.groupby(by=['flag','price']).sum().groupby(level=[1]).cumsum()

I only want the running total of `price` to include rows where `flag` is 1, with 0 shown on rows where `flag` is 0. I feel like this should be simple, but I'm missing something fundamental. The dataset is huge, so I'm not looking for answers that use loops or row-by-row iteration.
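
One vectorized way to get the `cum_sum` values shown in the table, offered as a sketch rather than a definitive answer. It assumes `flag` only takes the values 0 and 1 and that unflagged rows should display 0 while the running total carries on past them. The name `mask` is introduced here for illustration only.

```python
import pandas as pd

# Sample data copied from the table in the question.
df = pd.DataFrame({
    "price": [2, 5, 8, 9, 12, 2],
    "flag": [1, 1, 1, 0, 0, 1],
})

# Boolean mask of the rows that should contribute to the running total
# (assumption: flag == 1 means "include this row").
mask = df["flag"].eq(1)

# Replace unflagged prices with 0 so they add nothing to the total,
# take the cumulative sum, then show 0 again on the unflagged rows.
df["cum_sum"] = df["price"].where(mask, 0).cumsum().where(mask, 0)

print(df)
```

This avoids `groupby` entirely: `Series.where` and `Series.cumsum` both operate on whole columns, so no Python-level loop is involved.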