Given a DataFrame with duplicated values in a column ix, such as this:
   ix  value
0   1     12
1   2     14
2   2     15
3   2     12
4   3     10
5   4      9
6   5     14
7   5     21
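To make the example reproducible, here is a minimal sketch that builds the frame shown above. The variable name df matches the attempt further down. The constructor call itself is an assumption, since only the printed output is given.

```python
import pandas as pd

# Sample data copied from the table above; the default RangeIndex
# supplies the 0..7 row labels.
df = pd.DataFrame({
    "ix": [1, 2, 2, 2, 3, 4, 5, 5],
    "value": [12, 14, 15, 12, 10, 9, 14, 21],
})

print(df)
```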
How can you add a column that identifies and counts the duplicated values of ix, numbering each occurrence in order?
Desired result:
   ix  value  dupes
0   1     12      1
1   2     14      1
2   2     15      2
3   2     12      3
4   3     10      1
5   4      9      1
6   5     14      1
7   5     21      2
I've tried different approaches with pd.merge, for example:
pd.merge(df, df.groupby('ix').size().to_frame('dupes').reset_index(), on = 'ix', how = 'left')
But as you can see, this only gives the total count for each ix value rather than a running count:
   ix  value  dupes
0   1     12      1
1   2     14      3
2   2     15      3
3   2     12      3
4   3     10      1
5   4      9      1
6   5     14      2
7   5     21      2
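One possible direction, offered as a sketch rather than a definitive answer: pandas has a GroupBy.cumcount() method that numbers the rows within each group starting from 0, in their original order. Adding 1 gives a running count per ix value instead of the group total. The sample frame is rebuilt here so the snippet is self-contained; its construction is an assumption based on the printed data.

```python
import pandas as pd

# Sample data reconstructed from the question (assumed construction).
df = pd.DataFrame({
    "ix": [1, 2, 2, 2, 3, 4, 5, 5],
    "value": [12, 14, 15, 12, 10, 9, 14, 21],
})

# cumcount() numbers each row within its "ix" group as 0, 1, 2, ...
# so adding 1 turns it into a 1-based occurrence counter.
df["dupes"] = df.groupby("ix").cumcount() + 1

print(df)
```

Unlike the merge on groupby(...).size(), this needs no join, because cumcount() returns a Series aligned with the original row index.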