Convert unique category names to integer

Question

In the Iris dataset, the flower labels in 'target_names' ('setosa', 'versicolor', 'virginica') are represented by a 'target' array whose values are 0, 1 or 2:

from sklearn.datasets import load_iris
iris = load_iris()
iris

'target': array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2]), 'target_names': array(['setosa', 'versicolor', 'virginica'], dtype='|S10')}

Now I have a training data set which looks something like this:

> Photography         0.1 0.1 0.1 0.1 0.1
> Social              0.2 0.2 0.2 0.2 0.2
> Libraries and Demo  0.3 0.3 0.3 0.3 0.3
> Arcade and Action   0.4 0.4 0.4 0.4 0.4
> Health and Fitness  0.5 0.5 0.5 0.5 0.5

How can I convert my labels ('Photography', 'Social', etc.) into integer target values (0, 1, 2, etc.), as in the Iris dataset?

There are 30 unique labels in total across 20,000 rows and 14,000 columns.

See here: http://stackoverflow.com/questions/20250771/remap-values-in-pandas-column-with-a-dict — Arya McCarthy, May 04 '17 at 08:09
Thank you! This helped to solve my problem - appreciate the link : ) — Ali, May 04 '17 at 10:14
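For later readers, the dict-based remapping that the linked question describes might look roughly like the sketch below. It is only an illustration in plain Python: the sample `labels` list and the names `target_names`, `name_to_code` and `target` are assumptions chosen to mirror the Iris layout, not anything taken from the asker's actual data.

```python
# Hypothetical sample labels; the real data has 30 unique labels over 20,000 rows.
labels = [
    "Photography", "Social", "Libraries and Demo",
    "Arcade and Action", "Health and Fitness", "Social", "Photography",
]

# Sorted unique names play the role of iris['target_names'].
target_names = sorted(set(labels))

# Dict from each name to its integer code (its position in target_names).
name_to_code = {name: code for code, name in enumerate(target_names)}

# The integer codes play the role of iris['target'].
target = [name_to_code[name] for name in labels]

print(target_names)
print(target)
```

Because the codes are just positions in `target_names`, `target_names[code]` recovers the original label. If the labels live in a pandas column, passing the same dict to `Series.map` should give the same codes, and scikit-learn's `LabelEncoder` (`fit_transform`) also assigns codes in sorted label order.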
