In your solution, remove shift. To remove the 0 rows from the back of each group, reverse the order of the values within the group with iloc[::-1]:
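# group_keys=False keeps the original row index (needed in pandas >= 2.0),
# so the boolean mask aligns with df
mask = (df.groupby('user_id', group_keys=False)['target']
          .apply(lambda x: x.iloc[::-1].eq(1).cumsum().ne(0).iloc[::-1]))
df = df[mask]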
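To see why reversing works, here is a minimal sketch of the per-group logic on a single made-up Series (the values are an assumption for illustration, not data from the question). Counting the 1 values from the back gives 0 for every trailing row that comes after the last 1, so those rows are the ones dropped:

```python
import pandas as pd

# Hypothetical target values of one user, in original row order
s = pd.Series([0, 1, 0, 1, 0, 0], name="target")

rev = s.iloc[::-1]                    # read the group from the back
seen_ones = rev.eq(1).cumsum()        # running count of 1s, starting at the last row
keep = seen_ones.ne(0).iloc[::-1]     # True from the last 1 backwards; restore order

print(keep.tolist())     # -> [True, True, True, True, False, False]
print(s[keep].tolist())  # -> [0, 1, 0, 1]
```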
For better performance, if target contains only 0 and 1 values, you can use:
mask = df.iloc[::-1].groupby('user_id')['target'].cumsum().ne(0).iloc[::-1]
df = df[mask]
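A small self-contained check of this vectorized version, assuming a made-up DataFrame with only 0 and 1 in target (the sample rows are hypothetical). Because no lambda is called per group, the whole mask is built with one reversed grouped cumsum:

```python
import pandas as pd

# Hypothetical sample data: two users, target contains only 0 and 1
df = pd.DataFrame({
    "user_id": [1, 1, 1, 1, 2, 2, 2],
    "target":  [0, 1, 0, 0, 1, 1, 0],
})

# Reverse all rows, take the cumulative sum per user, then restore the order
mask = df.iloc[::-1].groupby("user_id")["target"].cumsum().ne(0).iloc[::-1]
out = df[mask]

print(out.index.tolist())      # -> [0, 1, 4, 5]
print(out["target"].tolist())  # -> [0, 1, 1, 1]
```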
If target also contains values other than 0 and 1, compare with 1 first:
mask = (df.iloc[::-1]
          .assign(new=lambda x: x['target'].eq(1))
          .groupby('user_id')['new']
          .cumsum().ne(0)
          .iloc[::-1])
df = df[mask]
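The difference between the two vectorized versions shows up as soon as another value appears. In this sketch (made-up data, with a hypothetical value 2 in target), the plain cumsum treats the 2 like a 1 and keeps its row, while comparing with 1 first does not:

```python
import pandas as pd

# Hypothetical sample data: one user, target also contains a 2
df = pd.DataFrame({
    "user_id": [1, 1, 1, 1],
    "target":  [0, 1, 2, 0],
})

# Plain cumsum: any non-zero value counts, so the row with 2 is kept
naive = df.iloc[::-1].groupby("user_id")["target"].cumsum().ne(0).iloc[::-1]

# Compare with 1 first: only rows up to the last 1 are kept
mask = (df.iloc[::-1]
          .assign(new=lambda x: x["target"].eq(1))
          .groupby("user_id")["new"]
          .cumsum().ne(0)
          .iloc[::-1])

print(df[naive].index.tolist())  # -> [0, 1, 2]
print(df[mask].index.tolist())   # -> [0, 1]
```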
If you need to avoid removing groups that contain only 0 values, use:
mask = df.groupby('user_id')['target'].transform('any')
mask1 = (df.iloc[::-1]
           .assign(new=lambda x: x['target'].eq(1))
           .groupby('user_id')['new']
           .cumsum().ne(0)
           .iloc[::-1])
df = df[~mask | mask1]
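A short sketch of the combined condition on made-up data (the rows are hypothetical): user 2 has only 0 values, so ~mask is True for all of its rows and the group survives, while user 1 still loses the 0 rows after its last 1:

```python
import pandas as pd

# Hypothetical sample data: user 2 has no 1 at all
df = pd.DataFrame({
    "user_id": [1, 1, 1, 2, 2],
    "target":  [1, 0, 0, 0, 0],
})

# True for every row of a group that has at least one non-zero target
mask = df.groupby("user_id")["target"].transform("any")

# True for rows up to and including the last 1 of each group
mask1 = (df.iloc[::-1]
           .assign(new=lambda x: x["target"].eq(1))
           .groupby("user_id")["new"]
           .cumsum().ne(0)
           .iloc[::-1])

# Keep all-zero groups entirely, otherwise keep rows up to the last 1
out = df[~mask | mask1]

print(out.index.tolist())  # -> [0, 3, 4]
```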