
How can I get the data frame below?

import pandas as pd

dd = pd.DataFrame({'val': [0,0,1,1,1,0,0,0,0,1,1,0,1,1,1,1,0,0],
                   'groups': [1,1,1,1,1,2,2,2,2,2,2,3,3,3,3,3,'ignore','ignore']})

     val    groups
0     0       1
1     0       1
2     1       1
3     1       1
4     1       1
5     0       2
6     0       2
7     0       2
8     0       2
9     1       2
10    1       2
11    0       3
12    1       3
13    1       3
14    1       3
15    1       3
16    0  ignore
17    0  ignore

I have a series dd.val with the values [0,0,1,1,1,0,0,0,0,1,1,0,1,1,1,1,0,0].
How can I create dd.groups from dd.val?

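The first run, 0,0,1,1,1, forms group 1 (i.e. from the beginning up to the next occurrence of 0 after the 1's).
0,0,0,0,1,1 forms group 2 (the group number increments; each group starts where the previous one ended and runs until the next occurrence of 0 after the 1's), and so on.
The trailing 0,0, which is not followed by any 1's, is labelled ignore.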

Can anyone please help?

Shijith

2 Answers


First, test whether a value is 0 and the previous value is 1, then create the groups from that mask with a cumulative sum via Series.cumsum:

s = (dd['val'].eq(0) & dd['val'].shift().eq(1)).cumsum().add(1)
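
To see why this works, here is a minimal sketch on a shorter series that prints the intermediate boundary mask next to the resulting group numbers. The names `val`, `starts` and `group_ids` are only illustrative and do not appear in the question.

```python
import pandas as pd

# Short illustrative series (not the data from the question)
val = pd.Series([0, 0, 1, 1, 1, 0, 0, 1, 0])

# True where a 0 directly follows a 1, i.e. where a new group starts
starts = val.eq(0) & val.shift().eq(1)

# Running count of group starts, offset so the first group is numbered 1
group_ids = starts.cumsum().add(1)

print(pd.DataFrame({'val': val, 'starts': starts, 'group_ids': group_ids}))
```

Each True in `starts` bumps the cumulative sum by one, so every row up to the next True shares the same group number.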

Then convert the last group to ignore if the last value of the data is 0, using numpy.where:

mask = s.eq(s.max()) & (dd['val'].iat[-1] == 0)
dd['new'] = np.where(mask, 'ignore', s)
print(dd)
    val  groups     new
0     0       1       1
1     0       1       1
2     1       1       1
3     1       1       1
4     1       1       1
5     0       2       2
6     0       2       2
7     0       2       2
8     0       2       2
9     1       2       2
10    1       2       2
11    0       3       3
12    1       3       3
13    1       3       3
14    1       3       3
15    1       3       3
16    0  ignore  ignore
17    0  ignore  ignore
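
One thing to be aware of: mixing a string with integers in numpy.where typically yields an array of strings, so the group numbers in new would be '1', '2', '3' rather than integers. If that matters, a possible variant (a sketch, assuming you want to keep the integers as integers) casts to object and uses Series.mask instead:

```python
import pandas as pd

dd = pd.DataFrame({'val': [0, 0, 1, 1, 1, 0, 0, 0, 0, 1, 1, 0, 1, 1, 1, 1, 0, 0]})

# Group numbers: a 0 that directly follows a 1 starts a new group
s = (dd['val'].eq(0) & dd['val'].shift().eq(1)).cumsum().add(1)

# Rows of the final group, flagged only when the data ends on a 0
mask = s.eq(s.max()) & (dd['val'].iat[-1] == 0)

# Cast to object first so the integers stay integers next to the 'ignore' strings
dd['new'] = s.astype(object).mask(mask, 'ignore')
print(dd)
```

The column then has object dtype, like the groups column in the question.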
jezrael

IIUC, first apply diff and cumsum to build the group numbers, then use np.where to mark as ignore any group that contains no 1:

s = dd['val'].diff().eq(-1).cumsum() + 1
dd['New'] = np.where(dd['val'].eq(1).groupby(s).transform('any'), s, 'ignore')
dd
    val  groups     New
0     0       1       1
1     0       1       1
2     1       1       1
3     1       1       1
4     1       1       1
5     0       2       2
6     0       2       2
7     0       2       2
8     0       2       2
9     1       2       2
10    1       2       2
11    0       3       3
12    1       3       3
13    1       3       3
14    1       3       3
15    1       3       3
16    0  ignore  ignore
17    0  ignore  ignore
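
As a sanity check, the vectorized result can be compared with a plain loop that follows the rule as I read it from the question. This is only a sketch: `label_groups` is a hypothetical helper written for this comparison, not something from pandas.

```python
import pandas as pd

dd = pd.DataFrame({'val': [0, 0, 1, 1, 1, 0, 0, 0, 0, 1, 1, 0, 1, 1, 1, 1, 0, 0]})

# Vectorized version: a drop from 1 to 0 (diff == -1) starts a new group
s = dd['val'].diff().eq(-1).cumsum() + 1
has_one = dd['val'].eq(1).groupby(s).transform('any')
vectorized = s.astype(object).where(has_one, 'ignore').tolist()


def label_groups(values):
    """Plain-loop reading of the rule: a 0 right after a 1 opens a new group."""
    labels, group, prev = [], 1, None
    for v in values:
        if v == 0 and prev == 1:
            group += 1
        labels.append(group)
        prev = v
    # If the data ends on a 0, the last group holds no 1, so mark it 'ignore'
    if values and values[-1] == 0:
        last = labels[-1]
        labels = ['ignore' if g == last else g for g in labels]
    return labels


looped = label_groups(dd['val'].tolist())
print(vectorized == looped)
```

Only the last group can end up without a 1, because every earlier group is closed by a 0 that follows a 1.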
BENY