I am trying to count consecutive zeros:

Every time a non-zero value appears in the binary column, the count in the consec column restarts:

   binary  consec
0       1       0
1       0       1
2       0       2
3       0       3
4       0       4
5       1       0
6       0       1
7       0       2
8       1       0

With this solution, I can accomplish it:

import pandas as pd
df = pd.DataFrame({"binary": [1,0,0,0,0,1,0,0,1]})
df["consec"] = df["binary"].groupby((df["binary"] != 0).cumsum()).cumcount()

   binary  consec
0       1       0
1       0       1
2       0       2
3       0       3
4       0       4
5       1       0
6       0       1
7       0       2
8       1       0
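
To make the single-column trick easier to follow, here is a minimal sketch that splits it into steps. The name `block_id` is only illustrative and is not part of the original one-liner; the data is the `binary` column from the table above.

```python
import pandas as pd

df = pd.DataFrame({"binary": [1, 0, 0, 0, 0, 1, 0, 0, 1]})

# Every non-zero value opens a new block: the running sum of the
# "is non-zero" mask gives each row the id of the block it belongs to.
block_id = (df["binary"] != 0).cumsum().rename("block_id")

# cumcount() numbers the rows inside each block starting from 0, so the
# non-zero row that opens a block gets 0 and the zeros after it get 1, 2, ...
df["consec"] = df["binary"].groupby(block_id).cumcount()

print(df.assign(block_id=block_id))
```

One caveat with this approach: if the column starts with zeros, they fall into block 0, which has no non-zero row to open it, so the first zero is numbered 0 instead of 1.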

However, I would like to do the same for multi-index situations, where the count also restarts whenever the (gp_1, gp_2) group changes, like this:

import pandas as pd
df = pd.DataFrame({"gp_1": [1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2], "gp_2": [1,1,1,1,1,1,1,1,1,3,3,3,4,4,4,4,4,4], "binary": [0,1,1,1,0,0,1,1,0, 0,1,1,0,0,0,1,1,0]})

Expected Output:

    gp_1  gp_2  binary  consec
0      1     1       0       1
1      1     1       1       0
2      1     1       1       0
3      1     1       1       0
4      1     1       0       1
5      1     1       0       2
6      1     1       1       0
7      1     1       1       0
8      1     1       0       1
9      2     3       0       1
10     2     3       1       0
11     2     3       1       0
12     2     4       0       1
13     2     4       0       2
14     2     4       0       3
15     2     4       1       0
16     2     4       1       0
17     2     4       0       1
William
  • What is your expected output? Please explain what you mean by a multi-indexed situation. – anky Aug 23 '20 at 18:41
  • Sorry, my mistake. I just posted an updated version. – William Aug 23 '20 at 18:46
  • change `.groupby((df["binary"] == 0)` to `.groupby(["gp_1", "gp_2", df["binary"] == 0])`? – Dan Aug 23 '20 at 18:49
  • Both `df.groupby(["gp_1", "gp_2", df["binary"] != 0]).cumsum().cumcount()` and `df["binary"].groupby(["gp_1", "gp_2", df["binary"] != 0]).cumsum().cumcount()` result in an error. – William Aug 23 '20 at 19:02

1 Answer

Let us try

df.groupby([df.gp_1,df.gp_2,df.binary.diff().ne(0).cumsum()]).cumcount().add(1).where(df.binary==0,0)
Out[149]: 
0     1
1     0
2     0
3     0
4     1
5     2
6     0
7     0
8     1
9     1
10    0
11    0
12    1
13    2
14    3
15    0
16    0
17    1
dtype: int64
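
For readers who want to see what each piece of the one-liner does, here is a rough step-by-step sketch. The names `run_id` and `pos_in_run` are only illustrative, and the sample data is an assumption: it is chosen to reproduce the expected-output table from the question.

```python
import pandas as pd

df = pd.DataFrame({
    "gp_1":   [1] * 9 + [2] * 9,
    "gp_2":   [1] * 9 + [3, 3, 3, 4, 4, 4, 4, 4, 4],
    "binary": [0, 1, 1, 1, 0, 0, 1, 1, 0, 0, 1, 1, 0, 0, 0, 1, 1, 0],
})

# A new run starts wherever the value differs from the previous row;
# the running sum of those change points labels each run.
run_id = df["binary"].diff().ne(0).cumsum().rename("run_id")

# Position of each row inside its (gp_1, gp_2, run) group, counted from 1.
pos_in_run = df.groupby(["gp_1", "gp_2", run_id]).cumcount().add(1)

# Keep the position only where binary is 0; every other row gets 0.
df["consec"] = pos_in_run.where(df["binary"].eq(0), 0)

print(df)
```

Grouping by `gp_1` and `gp_2` in addition to the run label is what stops a run of zeros from continuing across a group boundary: rows 8 and 9 are both 0 and share the same `run_id`, but they sit in different groups, so the count restarts at 1.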
BENY