I have a pandas dataframe that looks like this:
import pandas as pd
foo = pd.DataFrame({'id': [1,1,1,1,1,2,2,2,2,2],
                    'col_a': [1,1,0,1,0,1,1,1,0,1],
                    'col_b': [0,1,1,0,0,0,1,1,1,0]})
I would like to create 2 columns:
- col_a_consequent: 1 if col_a has n consecutive occurrences of 1s by id
- col_c: 1 if, after 2 consecutive occurrences of 1s at col_a, there is a 1 at col_b
For n=3, the output dataframe looks like this:
foo = pd.DataFrame({'id': [1,1,1,1,1,2,2,2,2,2],
                    'col_a': [1,1,0,1,0,1,1,1,0,1],
                    'col_b': [0,1,1,0,0,0,1,1,1,0],
                    'col_a_consequent': [0,0,0,0,0,1,1,1,0,0],
                    'col_c': [1,1,1,0,0,1,1,1,1,0]})
For col_a_consequent, according to this question, I can obtain what I want for a single id:
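n = 3
foo_tmp = foo.query('id == 2')
# Label each run of equal values in col_a (the label increments whenever the value changes)
runs = (foo_tmp.col_a != foo_tmp.col_a.shift()).cumsum()
# 1 where col_a is 1 and the run it belongs to has at least n rows
(foo_tmp.col_a.groupby(runs).transform('size') * foo_tmp.col_a >= n).astype(int)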
but I don't know how I can do the same operation with groupby for all ids.
Any ideas?
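
One direction that might work for col_a_consequent, as a rough sketch rather than a definitive answer: start a new run whenever either col_a or id changes, so that runs never span two ids. This assumes the rows of each id are contiguous and already in the desired order; the names new_run, run_id and run_len are only illustrative. It does not attempt col_c.

```python
import pandas as pd

foo = pd.DataFrame({'id': [1,1,1,1,1,2,2,2,2,2],
                    'col_a': [1,1,0,1,0,1,1,1,0,1],
                    'col_b': [0,1,1,0,0,0,1,1,1,0]})
n = 3

# Assumption: rows of the same id are adjacent and ordered.
# A new run starts whenever col_a changes or the id changes.
new_run = (foo['col_a'] != foo['col_a'].shift()) | (foo['id'] != foo['id'].shift())
run_id = new_run.cumsum()

# Length of the run each row belongs to
run_len = foo.groupby(run_id)['col_a'].transform('size')

# 1 where col_a is 1 and its run has at least n rows
foo['col_a_consequent'] = ((foo['col_a'] == 1) & (run_len >= n)).astype(int)
```

On the sample data this should reproduce the col_a_consequent column from the expected output for n=3. If the ids are not contiguous, sorting by id first (with a stable sort) would be needed before applying the same idea.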