I have a DataFrame with a column of author names, in which the same author can appear more than once. I want to assign a unique number to each distinct author name in a parallel column (for simplicity, assume the numbers are consecutive whole numbers: 0, 1, 2, 3, and so on).
I can do this with nested for loops, but with 57,000 records and 500-odd unique authors it takes far too long. Is there a quicker way to do this?
For example, the original DataFrame contains:
**Author**
Name 1
Name 2
Name 1
Name 3
I want another column added next to it, such that:
**Author** **AuthorID**
Name 1 1
Name 2 2
Name 1 1
Name 3 3
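
One possible vectorized approach, sketched under the assumption that this is a pandas DataFrame (the variable name `df` and the sample rows below are illustrative, taken from the example above): `pd.factorize` assigns integer codes to distinct values in order of first appearance. The codes start at 0, so the sketch adds 1 to reproduce the numbering in the example table; drop the `+ 1` for 0-based IDs.

```python
import pandas as pd

# Illustrative data mirroring the example above; `df` is an assumed name.
df = pd.DataFrame({"Author": ["Name 1", "Name 2", "Name 1", "Name 3"]})

# factorize returns (codes, uniques): codes are 0-based integers assigned
# in order of first appearance, uniques holds the distinct author names.
codes, uniques = pd.factorize(df["Author"])

# Shift by 1 so the IDs match the example table (1, 2, 1, 3).
df["AuthorID"] = codes + 1

print(df)
```

This avoids Python-level loops entirely, so it should scale comfortably to tens of thousands of rows. `df.groupby("Author", sort=False).ngroup()` is another option that should give the same 0-based numbering.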