Create an index column by group

Question

I would like to add an index column to my dataframe that, within each group, runs from 0 to the number of observations in the group minus one. That is, starting from:

pd.DataFrame([["John","Car"],["John","House"],["Sam","Skate"],["Sam","Disco"],["Sam","Space"]])

I would like to get:

pd.DataFrame([["John","Car",0],["John","House",1],["Sam","Skate",0],["Sam","Disco",1],["Sam","Space",2]])

Thanks

score 3 · Answer 1 · answered Feb 11 '19 at 13:49


You're looking for the cumulative count function, cumcount, which numbers the rows of each group starting from 0:

import pandas as pd
df = pd.DataFrame([["John","Car"],["John","House"],["Sam","Skate"],["Sam","Disco"],["Sam","Space"]])
df[2] = df.groupby(0).cumcount()  # 0-based position of each row within its group
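As a rough sketch of how this behaves when the groups are not stored contiguously, the snippet below interleaves the rows of the two groups. The column labels name, item and idx are assumptions added for readability; the frame in the question uses the default integer labels.

```python
import pandas as pd

# Same data as in the question, but with the rows of the two groups interleaved.
df = pd.DataFrame(
    [["Sam", "Skate"], ["John", "Car"], ["Sam", "Disco"], ["John", "House"], ["Sam", "Space"]],
    columns=["name", "item"],  # hypothetical labels, not from the question
)

# cumcount numbers rows by their order of appearance within each group,
# so the frame does not need to be sorted by the grouping column first.
df["idx"] = df.groupby("name").cumcount()
print(df)
```

The counter follows the existing row order, so sort the frame beforehand if the numbering should follow some other order.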


Zulfiqaar

anky · Answer 2 · 2019-02-11T14:00:10.073

2

Use:

df.groupby(0)[0].apply(lambda x:x.duplicated().cumsum())
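This works because every value of column 0 is identical within a group, so duplicated() is False for the first row and True for each later one, and the running sum gives 0, 1, 2, and so on. The sketch below is a variation rather than the answer's exact code: it swaps apply for transform, on the assumption that an index aligned with the original frame is wanted (depending on the pandas version, apply may prepend the group keys to the index), and compares the result with cumcount.

```python
import pandas as pd

df = pd.DataFrame([["John", "Car"], ["John", "House"], ["Sam", "Skate"], ["Sam", "Disco"], ["Sam", "Space"]])

# Within one group all values of column 0 are equal, so duplicated() marks
# every row except the first; cumsum turns those flags into a running counter.
via_duplicated = df.groupby(0)[0].transform(lambda x: x.duplicated().cumsum())

# The built-in per-group counter, for comparison.
via_cumcount = df.groupby(0).cumcount()

print(via_duplicated.tolist())
print(via_cumcount.tolist())
```

Both give the same numbering here; cumcount is the more direct of the two, since it does not call a Python function once per group.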

edited Feb 11 '19 at 14:00

answered Feb 11 '19 at 13:48

anky
