I have the following df:
id year_month pct
10 201901 10
20 201901 5
30 201901 3
40 201901 2
10 201902 8
20 201902 2
30 201902 7
40 201902 3
I want to sort by pct, group by year_month, and then take the cumulative sum of pct within each group, flagging the rows where that running total is > 10:
df.sort_values(['pct']).groupby('year_month')['pct'].apply(lambda x: x.cumsum().gt(10))
but this only gives me a boolean Series, in sorted order:
3 False
5 False
2 False
7 False
1 False
6 True
4 True
0 True
Name: pct, dtype: bool
I am wondering how to get this Series back into df as a new column, like this:
id year_month pct non-tail
10 201901 10 True
20 201901 5 False
30 201901 3 False
40 201901 2 False
10 201902 8 True
20 201902 2 False
30 201902 7 True
40 201902 3 False
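For reference, here is a minimal sketch of one way this might be done, assuming the frame has a default integer index and is built exactly as in the table at the top. The name running_total is only illustrative. It relies on GroupBy.cumsum keeping the original row labels, so that assigning the result back to df aligns on the index rather than on position.

```python
import pandas as pd

# Frame reconstructed from the table in the question (default RangeIndex assumed).
df = pd.DataFrame({
    "id": [10, 20, 30, 40, 10, 20, 30, 40],
    "year_month": [201901] * 4 + [201902] * 4,
    "pct": [10, 5, 3, 2, 8, 2, 7, 3],
})

# Sort by pct, then take the running total within each year_month.
# GroupBy.cumsum returns a Series that keeps the original row labels.
running_total = df.sort_values("pct").groupby("year_month")["pct"].cumsum()

# Column assignment aligns on the index, so the sorted order of
# running_total does not change which row each flag lands on.
df["non-tail"] = running_total.gt(10)
print(df)
```

Using cumsum directly on the groupby, instead of apply with a lambda, should also avoid depending on how apply handles group keys in the result index, which I believe differs between pandas versions.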