I have a dataframe populated by responses from a Google Form. The new_cycle
column holds the answer to a yes/no question on the form. For each response, I need to calculate the cycle_day:
a 'yes' should be 1, and every 'no' after it should increment the count by 1. Once another 'yes' response is recorded, the count resets to 1.
Here is a minimal, reproducible example. Note that I'm actually doing this on a much larger dataframe, so an efficient (ideally vectorized) solution is important.
import pandas as pd
df = pd.DataFrame(['yes', 'no', 'no', 'no', 'yes', 'no'], columns=['new_cycle'])
# df:
new_cycle
0 yes
1 no
2 no
3 no
4 yes
5 no
My desired output would be:
new_cycle
0 1
1 2
2 3
3 4
4 1
5 2
# OR:
new_cycle cycle_day
0 yes 1
1 no 2
2 no 3
3 no 4
4 yes 1
5 no 2
How would I do this?
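One approach that seems to fit, offered as a minimal sketch rather than a definitive answer: treat each 'yes' as the start of a new group, build a group id with a cumulative sum, and number the rows within each group. The name cycle_id is just an illustrative helper, and the sketch assumes the answers are exactly the lowercase strings 'yes' and 'no' and that the rows are already in response order.

```python
import pandas as pd

df = pd.DataFrame(['yes', 'no', 'no', 'no', 'yes', 'no'], columns=['new_cycle'])

# Each 'yes' starts a new cycle; the cumulative sum of that boolean mask
# gives every row a cycle id (1, 1, 1, 1, 2, 2 for the sample data).
# cycle_id is a hypothetical helper name, not part of the original frame.
cycle_id = df['new_cycle'].eq('yes').cumsum()

# cumcount() numbers the rows within each cycle starting at 0, so add 1
# to make the 'yes' row day 1.
df['cycle_day'] = df.groupby(cycle_id).cumcount() + 1

print(df)
```

Both steps are vectorized, so this should scale to a much larger dataframe without a Python-level loop. One caveat: any 'no' rows that appear before the first 'yes' end up in their own group (cycle id 0) and are counted from 1, which may or may not be what is wanted.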