Let's take an example. Suppose my table values are:
subjects
english
mathematics
science
english
science
How can I convert this string data into numbered data, as shown in the table below?
subjects
1
2
3
1
3
Assuming your original dataframe looks like this:
>>> df
      subjects
0      english
1  mathematics
2      science
3      english
4      science
you could use pd.factorize:
df['factor'] = pd.factorize(df['subjects'])[0]+1
>>> df
      subjects  factor
0      english       1
1  mathematics       2
2      science       3
3      english       1
4      science       3
or, if you simply want to replace the values in subjects rather than create a new column factor, do this:
df['subjects'] = pd.factorize(df['subjects'])[0]+1
Note that the +1 is simply to get your exact output ranging from 1 to 3. Without it, you will still get valid categories, only ranging from 0 to 2.
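For completeness, here is a minimal self-contained sketch of the same approach. It also keeps the second value returned by pd.factorize (the unique labels), which lets you map the numbers back to the subject names. The variable names codes, uniques, and decoded are just illustrative choices, and the dataframe is rebuilt here from the values in the question.

```python
import pandas as pd

# Rebuild the example dataframe from the question.
df = pd.DataFrame(
    {"subjects": ["english", "mathematics", "science", "english", "science"]}
)

# pd.factorize returns (codes, uniques): codes are 0-based integers assigned
# in order of first appearance, uniques holds the distinct labels.
codes, uniques = pd.factorize(df["subjects"])

# Shift by 1 so the categories run from 1 to 3, as in the question.
df["factor"] = codes + 1

# Map the numbers back to labels by undoing the +1 shift.
decoded = uniques.take(df["factor"].to_numpy() - 1)

print(df)
print(list(decoded))
```

One thing to keep in mind: by default pd.factorize marks missing values with the code -1, so after the +1 shift they would show up as 0.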