Label encoding refers to converting the categorical labels in a machine learning dataset into numeric form, so that algorithms can operate on them directly. It is an important preprocessing step for structured datasets in supervised learning.
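As a rough illustration of the idea, here is a minimal pure-Python sketch. The helper names `fit_label_encoding`, `encode`, and `decode` are hypothetical and not part of any library; the sketch assigns integer codes to the distinct labels in sorted order, which is also how scikit-learn's LabelEncoder orders its classes.

```python
def fit_label_encoding(labels):
    """Build a label -> integer mapping from the distinct labels, in sorted order."""
    classes = sorted(set(labels))
    return {label: code for code, label in enumerate(classes)}


def encode(labels, mapping):
    """Replace each label with its integer code (raises KeyError on unseen labels)."""
    return [mapping[label] for label in labels]


def decode(codes, mapping):
    """Invert the mapping to recover the original labels."""
    inverse = {code: label for label, code in mapping.items()}
    return [inverse[code] for code in codes]


labels = ["Standing", "Walking", "Running", "Walking"]
mapping = fit_label_encoding(labels)   # {'Running': 0, 'Standing': 1, 'Walking': 2}
codes = encode(labels, mapping)        # [1, 2, 0, 2]
restored = decode(codes, mapping)      # same as the original labels
```

Note that the integer codes carry no meaningful order here; they only identify the classes.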
Questions tagged [label-encoding]
119 questions
33
votes
3 answers
How to apply LabelEncoder for a specific column in Pandas dataframe
I have a dataset loaded into a DataFrame where the class label needs to be encoded using LabelEncoder from scikit-learn. The column label is the class label column, which has the following classes:
['Standing', 'Walking', 'Running', 'null']
To perform…

Kristofer
7
votes
3 answers
Why should LabelEncoder from sklearn be used only for the target variable?
I was trying to create a pipeline with a LabelEncoder to transform categorical values.
cat_variable = Pipeline(steps=[
    ('imputer', SimpleImputer(strategy='most_frequent')),
    ('lencoder', LabelEncoder())
])
…

VisnuGanth
5
votes
1 answer
Use same Label Encoder for train and test dataframes
I have two different CSV files containing train data and test data. I created two DataFrames from these: train_features_df and test_features_df. Note that the test and train data have multiple categorical columns, so I need to apply LabelEncoder…

Invictus
4
votes
1 answer
ValueError: y contains previously unseen labels:
I've used a Decision Tree Classifier and I want to enter my input as a string rather than an integer value, but it gives me an error like:
Traceback (most recent call last):
File "D:/backup code for odoo project/New folder/New folder/main.py",…

Neel Parghi
4
votes
2 answers
How do I create a function to perform label encoding
I have the DataFrame:
df = pd.DataFrame({'colA': ['a', 'a', 'a', 'b', 'b'], 'colB': ['a', 'b', 'a', 'c', 'b'], 'colC': ['x', 'x', 'y', 'y', 'y']})
I would like to write a function to replace each value with its frequency count in that column. For…

The Rookie
3
votes
2 answers
How to encode a dataset having multiple datatypes?
I have a dataset like:
e = pd.DataFrame({
    'col1': ['A', 'A', 'B', 'W', 'F', 'C'],
    'col2': [2, 1, 9, 8, 7, 4],
    'col3': [0, 1, 9, 4, 2, 3],
    'col4': ['a', 'B', 'c', 'D', 'e', 'F']
})
Here I encoded the data using…

Samar Pratap Singh
3
votes
0 answers
LabelEncoder: ValueError: y contains previously unseen labels:
I'm using a random forest for prediction. I want to know what is wrong in my code and whether the encoding is done correctly.
import warnings
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.preprocessing import…

NAVYA REDDY
3
votes
0 answers
Parameter of OneHotEncoder : Categories
I have been coding ML with scikit-learn for a few months, but an update has changed the preprocessing class OneHotEncoder. There was a parameter categorical_features which has now been changed to categories, and now I am not understanding…

Harshit Gautam
2
votes
2 answers
Label encode subgroups after groupby
I want to label encode subgroups in a pandas dataframe. Something like this:
| Category   | Name   |
| ---------- | ------ |
| FRUITS     | Apple  |
| FRUITS     | Orange |
| FRUITS     | Apple  |
| Vegetables | Onion …

rohit deraj
2
votes
0 answers
Reversing Sci-Kit LabelEncoder, but have a 2D array dataset
I'm trying to create an automated data pre-processing library and I want to transform the string data into numerical form so it can be run through ML algorithms. But I can't seem to reverse it back to its original state, which should be relatively simple…

brockwill1
2
votes
2 answers
LabelEncoder().fit_transform gives me negative values?
Hi,
I have different city names in the column "City" in my dataset. I would love to encode it using LabelEncoder(). However, I got quite frustrating results with negative values
df['city_enc'] =…

Nguyen Ngoc Lan
2
votes
1 answer
convert data with LabelEncoder
I wrote this function to convert categorical features with LabelEncoder
# convert columns to dummies with LabelEncoder
cols = ['ToolType', 'TestType', 'BatteryType']
# apply one hot encoder
le = LabelEncoder()
for col in cols:
    data[col] =…

apham15
2
votes
1 answer
Using a LabelEncoder in sklearn's Pipeline gives: fit_transform takes 2 positional arguments but 3 were given
I've been trying to run some ML code but I keep faltering at the fitting stage after running my pipeline. I've looked around on various forums to not much avail. What I've discovered is that some people say you can't use LabelEncoder within a…
user14046737
2
votes
1 answer
ValueError: The truth value of a Series is ambiguous in one hot encoding error
I have the piece of code below where I am trying to use a one-hot encoder. But I get the error ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
from sklearn.preprocessing import LabelEncoder,…

Invictus
2
votes
1 answer
Is label encoding enough for output labels?
For ordinal features it makes sense to use label encoding, while for categorical features we use one-hot encoding. These are the conventions for input features, though. For output variables, is it necessary to use one-hot encoding if the output labels…

hafiz031