I have a DataFrame structured as follows:
group maybe_start maybe_end
0 ABC False False
1 ABC True False
2 ABC False False
3 ABC False False
4 ABC True False
5 ABC False False
6 ABC False True
7 ABC False False
8 DEF False False
9 DEF False False
10 DEF True False
11 DEF False False
12 DEF False True
13 DEF False False
14 DEF False False
15 DEF False True
16 DEF True False
17 DEF False False
18 DEF False True
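To make the example reproducible, the frame above can be built roughly like this. It is a minimal sketch: the `T`/`F` aliases are just shorthand introduced here, and the default integer index is assumed.

```python
import pandas as pd

# Shorthand aliases (introduced only for this example) to keep the rows readable.
T, F = True, False

df = pd.DataFrame({
    # 8 rows for ABC (index 0-7), 11 rows for DEF (index 8-18)
    "group": ["ABC"] * 8 + ["DEF"] * 11,
    "maybe_start": [F, T, F, F, T, F, F, F,
                    F, F, T, F, F, F, F, F, T, F, F],
    "maybe_end":   [F, F, F, F, F, F, T, F,
                    F, F, F, F, T, F, F, T, F, F, T],
})

print(df)
```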
I need to create a separate column, say group2, that labels the intervals defined by these start and end markers. Each interval in group2 should start at the first True value in maybe_start after the previous maybe_end==True, and end at the first occurrence of maybe_end==True after that start. In other words, a new value in group2 begins at maybe_start==True (row 1 in this example), and every following row of group2 gets the same value up to and including the first row where maybe_end==True (row 6 here). A further maybe_start==True inside an interval that is already open (row 4) does not start a new one, and rows outside any interval stay NaN. All of this needs to be done within a groupby on the group column, so the numbering restarts in each group. The expected output therefore looks as follows:
group maybe_start maybe_end group2
0 ABC False False NaN
1 ABC True False 1.0
2 ABC False False 1.0
3 ABC False False 1.0
4 ABC True False 1.0
5 ABC False False 1.0
6 ABC False True 1.0
7 ABC False False NaN
0 DEF False False NaN
1 DEF False False NaN
2 DEF True False 1.0
3 DEF False False 1.0
4 DEF False True 1.0
5 DEF False False NaN
6 DEF False False NaN
7 DEF False True NaN
8 DEF True False 2.0
9 DEF False False 2.0
10 DEF False True 2.0
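To pin down the intended semantics, the rule can be spelled out as a plain per-group loop. This is only a reference sketch, not the vectorised solution being asked for: `label_intervals` is a hypothetical helper name, and it assumes that a row where both flags are True while no interval is open would open and close an interval on that same row (the sample data has no such row).

```python
import numpy as np
import pandas as pd

def label_intervals(starts, ends):
    """Label the rows of ONE group with a plain loop (hypothetical helper)."""
    labels = []
    counter = 0        # number of intervals opened so far in this group
    is_open = False    # True while we are between a start and its end
    for s, e in zip(starts, ends):
        if not is_open and s:
            # First start after the previous end: open a new interval.
            counter += 1
            is_open = True
        # Rows inside an interval (including the end row) get its number.
        labels.append(float(counter) if is_open else np.nan)
        if is_open and e:
            # First end after the start: close the interval after labelling.
            is_open = False
    return labels

T, F = True, False
df = pd.DataFrame({
    "group": ["ABC"] * 8 + ["DEF"] * 11,
    "maybe_start": [F, T, F, F, T, F, F, F,
                    F, F, T, F, F, F, F, F, T, F, F],
    "maybe_end":   [F, F, F, F, F, F, T, F,
                    F, F, F, F, T, F, F, T, F, F, T],
})

# Apply the loop group by group; assignment aligns on the original index.
parts = [
    pd.Series(label_intervals(g["maybe_start"], g["maybe_end"]), index=g.index)
    for _, g in df.groupby("group")
]
df["group2"] = pd.concat(parts)
print(df)
```

Note that the state here is a simple latch: a start always leaves the interval open and an end always leaves it closed, so a vectorised version presumably only needs to know, for each row, which kind of marker was seen most recently within its group.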
How can I achieve this in a vectorised way in pandas?