33

I have a dataset loaded into a pandas DataFrame, and the class label needs to be encoded using LabelEncoder from scikit-learn. The column `label` is the class label column, which has the following classes:

['Standing', 'Walking', 'Running', 'null']

To perform label encoding, I tried the following but it does not work. How can I fix it?

from sklearn import preprocessing
import pandas as pd

df = pd.read_csv('dataset.csv', sep=',') 
df.apply(preprocessing.LabelEncoder().fit_transform(df['label']))
Darshan Jain
Kristofer
  • If you just run `preprocessing.LabelEncoder().fit_transform(df['label'])` on its own, outside of `apply()`, do you get the encoded labels? – andrew_reece May 09 '18 at 17:29
  • Yes, you are right, the error disappears, but I don't see the encoding! The classes are not transformed. That's why I used `apply()`, so that the transformation would be applied to the dataframe – Kristofer May 09 '18 at 17:34
  • `apply()` accepts a function, which it applies to each column (or row) of the dataframe. Here you are passing the already-transformed data to `apply()`, not a function, hence the error. – Vivek Kumar May 10 '18 at 05:35
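
To illustrate the point made in the comments, here is a minimal sketch of passing a function to `apply()` instead of its result. The small in-memory dataframe is a hypothetical stand-in for `dataset.csv`. Note that `apply()` returns a new object rather than modifying the dataframe in place, which is probably why no encoding showed up when the result was not assigned back.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Hypothetical stand-in for the contents of dataset.csv
df = pd.DataFrame({"label": ["Standing", "Walking", "Running", "null"]})

# Pass the function itself (no call with data); apply() calls it once per column.
encoded = df[["label"]].apply(LabelEncoder().fit_transform)

# apply() returns a new DataFrame, so assign the result back to keep it.
df["label"] = encoded["label"]
print(df)
```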

3 Answers

61

You can try the following:

le = preprocessing.LabelEncoder()
df['label'] = le.fit_transform(df.label.values)

Or the following would work too:

df['label'] = le.fit_transform(df['label'])

This replaces the original label values in the dataframe with the encoded labels.
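
As a self-contained sketch of this approach, the example below uses a small hypothetical dataframe in place of `dataset.csv`. It also shows `classes_` and `inverse_transform`, which may help when you need to map the integer codes back to the original class names.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Hypothetical stand-in for the contents of dataset.csv
df = pd.DataFrame({"label": ["Standing", "Walking", "Running", "null", "Walking"]})

le = LabelEncoder()
df["label"] = le.fit_transform(df["label"])  # assign the result back to the column

classes = list(le.classes_)   # sorted unique labels; each label's index is its code
encoded = df["label"].tolist()
decoded = list(le.inverse_transform(df["label"]))  # recover the original strings

print(classes)
print(encoded)
print(decoded)
```

Keeping the fitted `le` object around is what makes the reverse mapping possible, so it is usually worth storing it alongside the encoded data.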

niraj
4

You can also do:

from sklearn.preprocessing import LabelEncoder
le = LabelEncoder()
df.col_name = le.fit_transform(df.col_name.values)

where `col_name` is the name of the column that you want to label-encode.

Darshan Jain
2
from sklearn.preprocessing import LabelEncoder
le = LabelEncoder()
X[:, 2] = le.fit_transform(X[:, 2])

This could be helpful if your data is held in a NumPy array `X` and you want to encode one particular column (here, the third one).
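
A minimal sketch of the array-based variant follows. The contents of `X` are hypothetical, and the array is assumed to have `dtype=object` so that it can hold both numbers and strings.

```python
import numpy as np
from sklearn.preprocessing import LabelEncoder

# Hypothetical feature matrix; dtype=object lets one array mix numbers and strings
X = np.array(
    [[0.1, 1.2, "Standing"],
     [0.4, 0.9, "Walking"],
     [0.7, 1.5, "Running"]],
    dtype=object,
)

le = LabelEncoder()
X[:, 2] = le.fit_transform(X[:, 2])  # encode only the third column in place

print(X)
```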