I read this post but not exactly applicable to my case pandas-resetting cumsum at a specific number
My expected output on column "cumulation":
comment count pre_cnt_diff cumulation
auto 1 10 0 0
auto 2 30 20 20
auto 3 70 40 60
auto 4 120 50 110
auto 5 120 0 110
auto 6 130 10 120
auto 7 150 20 140
manual_input_1 150 0 0
auto 8 200 50 50
auto 9 230 30 80
manual_input_2 230 0 0
My current output on column "cumulation":
comment count pre_cnt_diff cumulation
auto 1 10 0 0
auto 2 30 20 20
auto 3 70 40 60
auto 4 120 50 110
auto 5 120 0 0
auto 6 130 10 10
auto 7 150 20 30
manual_input_1 150 0 0
auto 8 200 50 50
auto 9 230 30 80
manual_input_2 230 0 0
This is my code:
import pandas as pd
d = {'comment': ['auto 1', 'auto 2', 'auto 3', 'auto 4', 'auto 5', 'auto 6',
'auto 7', 'manual input 1', 'auto 8', 'auto 9', 'manual input 2'],
'count': [10,30,70,120,120,130,150,150,200,230,230]}
df = pd.DataFrame(data=d)
df['pre_cnt_diff'] = df['count'].diff().fillna(0)
g = df.pre_cnt_diff.eq(0).cumsum().fillna(0)
df['cumulation'] = df.groupby(g).pre_cnt_diff.cumsum()
I want the column "cumulation" to reset to 0 when comment has a special pattern of string "manual_input" and continue to cumsum from last 0 value. I think this must to use regex to search my comment column and add if-else statement. Much appreciated if anyone can assist to modify my code.