How to index column uniquely in Python using Pandas?

Question

I am trying to generate a unique index column in my dataset.

I have a column in my dataset as follows: 665678, 665678, 665678, 665682, 665682, 665682, 665690, 665690

And I would like to generate a separately indexed column looking like this: 1, 1, 1, 2, 2, 2, 3, 3

I came across the post "How to index columns uniquely??", which describes exactly what I am trying to do. However, since its solutions are written for R, I would like to know how I can implement the same thing in Python using Pandas.

Thanks

Thank you, everyone. Both solutions work, and having read [Pandas DENSE RANK](https://stackoverflow.com/questions/39357882/pandas-dense-rank), `factorize` seems to be the right option given that my data is sorted. – Jan 03 '19 at 15:18
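For context on the `factorize` option mentioned in this comment, here is a minimal sketch. The column name `col` and the new column name `idx` are assumptions for illustration, not names from the original dataset. `pd.factorize` numbers values in order of first appearance, so it matches a dense rank only when the data is already sorted.

```python
import pandas as pd

# Sample values from the question; "col" and "idx" are hypothetical names.
df = pd.DataFrame({"col": [665678, 665678, 665678, 665682,
                           665682, 665682, 665690, 665690]})

# factorize returns (codes, uniques); codes start at 0, so add 1.
codes, uniques = pd.factorize(df["col"])
df["idx"] = codes + 1
print(df["idx"].tolist())  # [1, 1, 1, 2, 2, 2, 3, 3]

# On unsorted data, codes follow order of first appearance, not value order.
unsorted_codes, _ = pd.factorize(pd.Series([30, 10, 30, 20]))
print((unsorted_codes + 1).tolist())  # [1, 2, 1, 3]
```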

score 1 · Accepted Answer · answered Jan 03 '19 at 15:06

Use `groupby` with `ngroup`, assuming the column is named `col`:

df.groupby('col').ngroup()+1

Output

0    1
1    1
2    1
3    2
4    2
5    2
6    3
7    3
dtype: int64
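The one-liner above can be expanded into a self-contained sketch. The DataFrame construction and the names `col`, `idx`, `idx_seen` and `idx_rank` are assumptions for illustration. By default `groupby` sorts the group keys, so `ngroup` numbers groups in value order; passing `sort=False` should number them by first appearance instead.

```python
import pandas as pd

# Sample values from the question; column names are hypothetical.
df = pd.DataFrame({"col": [665678, 665678, 665678, 665682,
                           665682, 665682, 665690, 665690]})
df["idx"] = df.groupby("col").ngroup() + 1
print(df["idx"].tolist())  # [1, 1, 1, 2, 2, 2, 3, 3]

# A small unsorted frame to show how the numbering options differ.
demo = pd.DataFrame({"col": [30, 10, 30, 20]})
demo["idx"] = demo.groupby("col").ngroup() + 1                  # value order
demo["idx_seen"] = demo.groupby("col", sort=False).ngroup() + 1  # first appearance
demo["idx_rank"] = demo["col"].rank(method="dense").astype(int)  # dense rank
print(demo)
```

On sorted data all three variants give the same numbering; they only diverge when the column is not sorted.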

Vivek Kalyanarangan


Hey @Vivek, thanks for the help – Jan 03 '19 at 15:25
