I have a dataset with two variables, A and B. Variable A is a list of numbers, some of which repeat. I want to fill B in each row with N, where N is the number of times that row's value of A has appeared so far, counting the current row.

This is the dataframe I have:

A      B
2101    
2101    
2102    
2102    
2102    
2103    
2104    
2104    
2104    
2104    

Here is how I want the output to be:

A       B
2101    1
2101    2
2102    1
2102    2
2102    3
2103    1
2104    1
2104    2
2104    3
2104    4
aschultz
Yubraj Bhusal

1 Answer

You can do this by grouping on A and calling cumcount:

df['B'] = df.groupby('A').cumcount() + 1  # cumcount numbers the rows within each group starting at 0, so add 1

Reference: pandas.core.groupby.GroupBy.cumcount
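For anyone who wants to try this end to end, here is a minimal sketch that rebuilds the dataframe from the question and applies the same one-liner. The second frame, `shuffled`, is a made-up example (not from the question) to illustrate that the count follows order of appearance, so equal values of A do not need to sit next to each other.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame(
    {"A": [2101, 2101, 2102, 2102, 2102, 2103, 2104, 2104, 2104, 2104]}
)

# cumcount() numbers the rows of each group 0, 1, 2, ... in order of appearance,
# so adding 1 gives a running count that starts at 1.
df["B"] = df.groupby("A").cumcount() + 1
print(df)

# Hypothetical extra case: repeated values that are not adjacent.
shuffled = pd.DataFrame({"A": [2101, 2102, 2101, 2102, 2101]})
shuffled["B"] = shuffled.groupby("A").cumcount() + 1
print(shuffled)
```

For the question's data, column B should match the desired output listed there.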

Sundeep Pidugu