0

I'm currently working on the Titanic dataset, which has 4-5 non-numeric columns. I want to use the sklearn.preprocessing.LabelEncoder class to encode these columns. I can, of course, apply it to each column one by one, but that becomes tedious when there are 20-30 such columns. Since I know the names of the non-numeric columns, is there a cleaner way to encode them all at once?

Nuance

1 Answer

-1

Select the object-dtype columns, then loop over them:

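from sklearn.preprocessing import LabelEncoder

le = LabelEncoder()
obj_cols = df.select_dtypes(include=[object]).columns  # non-numeric column names

for col in obj_cols:
    df[col + 'label'] = le.fit_transform(df[col])  # refits `le` on every column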
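
Since the question says the column names are already known, the same loop can also run over an explicit list. Below is a minimal sketch on a toy frame, assuming pandas and scikit-learn are installed. The data values are made up for illustration (they are not the real Titanic rows), and the `encoders` dict is a hypothetical addition that keeps one fitted encoder per column so the codes can be mapped back later.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Toy stand-in for the Titanic frame; values are illustrative only.
df = pd.DataFrame({
    "Sex": ["male", "female", "female", "male"],
    "Embarked": ["S", "C", "Q", "S"],
    "Age": [22.0, 38.0, 26.0, 35.0],
})

# Known non-numeric columns, listed explicitly.
cat_cols = ["Sex", "Embarked"]

# One encoder per column, kept so each mapping can be inverted later.
encoders = {}
for col in cat_cols:
    enc = LabelEncoder()
    df[col + "label"] = enc.fit_transform(df[col])
    encoders[col] = enc

# LabelEncoder assigns codes in sorted order of the class labels.
print(df[["Sexlabel", "Embarkedlabel"]])
print(list(encoders["Sex"].inverse_transform([0, 1])))
```

Keeping a separate encoder per column matters only if the mapping is needed again, for example to decode predictions or to encode new rows the same way.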
Abhishek Sharma
  • Using a single LabelEncoder object `le` will be problematic when it is used on both train and test data. – Vivek Kumar Dec 17 '17 at 10:25
  • It's always advisable to combine train and test data before performing label encoding. If you run the label encoder separately on each, you always run the risk of having new categories in the test data. – Abhishek Sharma Dec 17 '17 at 16:26
  • I wouldn't say "combine train and test data before..." for anything, because the point of "test" is to simulate the new data you get in production, and you don't know in advance what that data will look like. – Max Power Dec 18 '17 at 16:22
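
One way to reconcile the comments above is to fit the encoding on the training data only and decide explicitly what happens to categories that first show up in the test data. The sketch below is an assumption-laden illustration, not the answerer's method: it swaps in scikit-learn's `OrdinalEncoder` (which encodes several feature columns at once) with `handle_unknown="use_encoded_value"`, an option that, to my knowledge, requires scikit-learn 0.24 or later. The toy frames and the choice of `-1` as the code for unseen categories are hypothetical.

```python
import pandas as pd
from sklearn.preprocessing import OrdinalEncoder

# Toy train/test split; values are illustrative only.
train = pd.DataFrame({
    "Sex": ["male", "female", "female"],
    "Embarked": ["S", "C", "S"],
})
test = pd.DataFrame({
    "Sex": ["female", "male"],
    "Embarked": ["Q", "S"],  # "Q" never appears in train
})

cat_cols = ["Sex", "Embarked"]

# Unseen categories are mapped to -1 instead of raising an error.
enc = OrdinalEncoder(handle_unknown="use_encoded_value", unknown_value=-1)

# Fit on train only, then reuse the same mapping on test.
train_codes = enc.fit_transform(train[cat_cols])
test_codes = enc.transform(test[cat_cols])

print(train_codes)
print(test_codes)
```

With this setup the test set never influences the mapping, and an unseen category such as "Q" becomes a sentinel value that the downstream model or preprocessing can handle deliberately.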