Questions tagged [label-encoding]

Label encoding converts the categorical labels in a machine learning data set into numeric form, so that algorithms that require numeric input can process them. It is a common preprocessing step for structured data sets in supervised learning.
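As a rough illustration of the idea in the tag description, here is a minimal pure-Python sketch. The helper names (`fit_label_mapping`, `encode`, `decode`) are hypothetical and not part of any library. The sketch assumes labels are mapped to integers in sorted order, which is also how scikit-learn's `LabelEncoder` orders its classes, and that an unknown label should raise an error rather than be silently encoded.

```python
def fit_label_mapping(labels):
    """Map each distinct label to an integer, assigned in sorted order."""
    return {label: index for index, label in enumerate(sorted(set(labels)))}


def encode(labels, mapping):
    """Encode labels with a fitted mapping; reject labels not seen when fitting."""
    unseen = sorted(set(labels) - set(mapping))
    if unseen:
        raise ValueError(f"previously unseen labels: {unseen}")
    return [mapping[label] for label in labels]


def decode(codes, mapping):
    """Invert the mapping to recover the original labels."""
    inverse = {index: label for label, index in mapping.items()}
    return [inverse[code] for code in codes]


# Hypothetical example data
train = ["Standing", "Walking", "Running", "Walking"]
mapping = fit_label_mapping(train)
codes = encode(train, mapping)
restored = decode(codes, mapping)
```

Fitting the mapping once and reusing it for both training and test data keeps the integer codes consistent between the two.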

119 questions
33
votes
3 answers

How to apply LabelEncoder for a specific column in Pandas dataframe

I have a dataset loaded into a dataframe where the class label needs to be encoded using LabelEncoder from scikit-learn. The column label is the class label column, which has the following classes: ['Standing', 'Walking', 'Running', 'null'] To perform…
Kristofer
7
votes
3 answers

Why should LabelEncoder from sklearn be used only for the target variable?

I was trying to create a pipeline with a LabelEncoder to transform categorical values. cat_variable = Pipeline(steps = [ ('imputer',SimpleImputer(strategy = 'most_frequent')), ('lencoder',LabelEncoder()) ]) …
5
votes
1 answer

Use same Label Encoder for train and test dataframes

I have two different CSV files, one with training data and one with test data. I created two dataframes from these: train_features_df and test_features_df. Note that the test and train data have multiple categorical columns, so I need to apply labelEncoder…
Invictus
4
votes
1 answer

Value Error: y contains previously unseen labels:

I've used a Decision Tree Classifier and I want to enter my input as a string rather than giving an integer value, but it gives me an error like: Traceback (most recent call last): File "D:/backup code for odoo project/New folder/New folder/main.py",…
4
votes
2 answers

How do I create a function to perform label encoding

I have the dataframe - df = pd.DataFrame({'colA':['a', 'a', 'a', 'b' ,'b'], 'colB':['a', 'b', 'a', 'c', 'b'], 'colC':['x', 'x', 'y', 'y', 'y']}) I would like to write a function to replace each value with its frequency count in that column. For…
The Rookie
3
votes
2 answers

How to encode a dataset having multiple datatypes?

I have a dataset like: e = pd.DataFrame({ 'col1': ['A', 'A', 'B', 'W', 'F', 'C'], 'col2': [2, 1, 9, 8, 7, 4], 'col3': [0, 1, 9, 4, 2, 3], 'col4': ['a', 'B', 'c', 'D', 'e', 'F'] }) Here I encoded the data using…
3
votes
0 answers

LabelEncoder: ValueError- y contains previously unseen labels:

I'm using random forest for prediction. I want to know what is wrong in my code and whether the encoding is done correctly ` import warnings import pandas as pd from sklearn.ensemble import RandomForestRegressor from sklearn.preprocessing import…
3
votes
0 answers

Parameter of OneHotEncoder : Categories

I have been coding ML with Scikit-learn for a few months, but an update has come to the scikit preprocessing object OneHotEncoder. There was a parameter categorical_features, which has now been changed to categories, and now I am not understanding…
2
votes
2 answers

Label encode subgroups after groupby

I want to label encode subgroups in a pandas dataframe. Something like this: | Category | | Name | | ---------- | | --------- | | FRUITS | | Apple | | FRUITS | | Orange | | FRUITS | | Apple | | Vegetables | | Onion …
2
votes
0 answers

Reversing Sci-Kit LabelEncoder, but have a 2D array dataset

I'm trying to create an automated data pre-processing library and I want to transform the string data into numerical form so it can be run through ML algorithms. But I can't seem to reverse it back to its original state, which should be relatively simple…
2
votes
2 answers

LabelEncoder().fit_transform gives me negative values?

Hei, I have different city names in the column "City" in my dataset. I would love to encode it using LabelEncoder(). However, I got quite frustrating results with negative values df['city_enc'] =…
2
votes
1 answer

convert data with LabelEncoder

I wrote this function to convert categorical features with LabelEncoder #convert columns to dummies with LabelEncoder cols = ['ToolType', 'TestType', 'BatteryType'] #apply one hot encoder le = LabelEncoder() for col in cols: data[col] =…
apham15
2
votes
1 answer

Using a LabelEncoder in sklearn's Pipeline gives: fit_transform takes 2 positional arguments but 3 were given

I've been trying to run some ML code but I keep faltering at the fitting stage after running my pipeline. I've looked around on various forums to not much avail. What I've discovered is that some people say you can't use LabelEncoder within a…
user14046737
2
votes
1 answer

ValueError: The truth value of a Series is ambiguous in one hot encoding error

I have the below piece of code where I am trying to use a one hot encoder, but I get the error ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all(). from sklearn.preprocessing import LabelEncoder,…
Invictus
2
votes
1 answer

Is label encoding enough for output labels?

For ordinal features it makes sense to use label encoding. But for categorical features we use one hot encoding. But these are the conventions for input features. But for output variables is it necessary to use one hot encoding if the output labels…