Is it possible to encode multiple selected columns at once?

Question

I have a list of columns in my large dataframe that are categorical, and I'm trying to encode them because some of the algorithms I'm using do not accept strings (KNN, for example).

Here's my code:

# Encode categories
from sklearn.preprocessing import LabelEncoder

le = LabelEncoder()
# dataImputed[catgoricalValues] = dataImputed[catgoricalValues].apply(le.fit_transform)  # didn't work
dataImputed[catgoricalValues] = le.fit_transform(dataImputed[catgoricalValues].astype(str))

I got this error:

ValueError: y should be a 1d array, got an array of shape (490546, 11) instead.

What can I do to encode only the columns in my catgoricalValues list while leaving all other values in my dataframe unchanged?

[see this answer](https://stackoverflow.com/questions/24458645/label-encoding-across-multiple-columns-in-scikit-learn) — RakeshV, Jul 20 '20 at 18:04
@RakeshV I tried that; see the commented-out line in my code. It didn't work: it complained about a mix of strings and floats, so I used the command below it to ensure everything was str. — Lostsoul, Jul 20 '20 at 18:07

RakeshV · Accepted Answer · 2020-07-20T19:27:00.883

1

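LabelEncoder only accepts a 1-D array, so loop over the columns and fit a separate encoder for each one. Try this:

import pandas as pd

from sklearn.preprocessing import LabelEncoder

def MultiLabelEncoder(columnlist, dataframe):
    # Fit a fresh encoder per column, since LabelEncoder expects 1-D input
    for i in columnlist:
        labelencoder_X = LabelEncoder()
        # Cast to str so a column mixing strings and floats does not raise a TypeError
        dataframe[i] = labelencoder_X.fit_transform(dataframe[i].astype(str))

# Encodes the listed columns in place; all other columns are left untouched
MultiLabelEncoder(catgoricalValuesColumnNameList, dataImputed)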
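As an alternative to the loop above, scikit-learn's OrdinalEncoder is designed for 2-D feature input and can encode several columns in one call. The following is only a minimal sketch, not part of the original answer: the helper name encode_columns and the toy dataframe are hypothetical stand-ins for dataImputed and its column list, and it assumes that casting to str (which turns missing values into the literal category "nan") is acceptable for your data.

```python
import pandas as pd
from sklearn.preprocessing import OrdinalEncoder

def encode_columns(dataframe, columnlist):
    """Return an encoded copy of dataframe plus the fitted encoder.

    encode_columns is a hypothetical helper name used for illustration.
    """
    encoded = dataframe.copy()
    encoder = OrdinalEncoder()
    # Cast to str first so columns mixing strings and floats can be sorted;
    # note that missing values become the literal category "nan".
    encoded[columnlist] = encoder.fit_transform(encoded[columnlist].astype(str))
    return encoded, encoder

# Toy data standing in for dataImputed (assumed, not from the question)
df = pd.DataFrame({
    "color": ["red", "blue", "red", "green"],
    "size": ["S", 3.0, "M", "S"],
    "price": [1.5, 2.0, 3.5, 4.0],
})

# Only "color" and "size" are encoded; "price" is left as it was
encoded_df, encoder = encode_columns(df, ["color", "size"])
print(encoded_df)
```

Keeping the fitted encoder around lets you map the codes back later with encoder.inverse_transform, or apply the same mapping to new data with encoder.transform. Be aware that the integer codes imply an ordering that distance-based models such as KNN will take literally.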

edited Jul 20 '20 at 19:27

answered Jul 20 '20 at 18:44
