I was trying to understand the differences between OneHotEncoder and get_dummies from a linked page.

When I ran the exact same code, I got this error:

AttributeError: 'OneHotEncoder' object has no attribute 'get_feature_names_out'

Here is the code:

import pandas as pd
import seaborn as sns
from sklearn.pipeline import Pipeline
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder
from sklearn.preprocessing import MinMaxScaler, StandardScaler
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

df = sns.load_dataset('tips')
df = df[['total_bill', 'tip', 'day', 'size']]

df.head(5)

X = df.drop('tip', axis=1)
y = df['tip']

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

ohe = OneHotEncoder(handle_unknown='ignore', sparse=False, dtype='int')
ohe.fit(X_train[['day']])

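def get_ohe(df):
    # One-hot encode 'day' with the fitted encoder and name the new columns
    temp_df = pd.DataFrame(data=ohe.transform(df[['day']]), columns=ohe.get_feature_names_out())
    # Drop the original column without mutating the caller's DataFrame
    df = df.drop(columns=['day'])
    # Reset the index so the rows line up with temp_df's default index
    df = pd.concat([df.reset_index(drop=True), temp_df], axis=1)
    return df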

X_train = get_ohe(X_train)
X_test = get_ohe(X_test)

X_train.head()

I checked OneHotEncoder in the sklearn.preprocessing module: the get_feature_names_out() method is there and it is not deprecated. I don't know why I am getting this error.

mehmet_zmn
  • What's your sklearn version? Release 1.1.0 (released a couple of days ago) provides `.get_feature_names_out()` for all transformers; with version 1.0.2 I'm able to run this code without any issue. You might also have a look at https://stackoverflow.com/questions/70993316/get-feature-names-after-sklearn-pipeline/71048229#71048229 and related links – amiola May 17 '22 at 18:14
  • I checked my sklearn version and it was 0.0. I then looked into why: sklearn is not the actual package, the real one is scikit-learn. I didn't know that, because I am still learning. My scikit-learn version was 0.24.0, so it was an old-version problem. I upgraded it and the problem is solved. Thanks a lot. – mehmet_zmn May 17 '22 at 19:00
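
For anyone hitting the same confusion, here is a minimal sketch of how the two distribution names could be checked side by side. It assumes Python 3.8+ (for `importlib.metadata`); the helper name `installed_version` is made up for this example and is not part of any library.

```python
from importlib import metadata  # standard library, Python 3.8+


def installed_version(dist_name):
    """Return the installed version of a pip distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


# "sklearn" on PyPI is only a placeholder distribution; the library itself
# is distributed as "scikit-learn", so the two can report different versions.
for name in ("sklearn", "scikit-learn"):
    print(name, "->", installed_version(name))
```

The version of the library that actually gets imported can also be read from `sklearn.__version__`.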

1 Answer

If you're using a scikit-learn version lower than 1.0, you need to use the get_feature_names method instead. For newer versions of scikit-learn, get_feature_names_out will work fine.
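
If the same code has to run on both old and new installations, one possible approach is to pick whichever method the fitted encoder exposes. This is only a sketch: `encoder_feature_names` is a hypothetical helper name, and it assumes the encoder has already been fitted.

```python
def encoder_feature_names(encoder, input_features=None):
    """Return output column names from a fitted encoder on old or new scikit-learn."""
    if hasattr(encoder, "get_feature_names_out"):
        # Available in scikit-learn 1.0 and newer
        return list(encoder.get_feature_names_out(input_features))
    # Older releases only offer get_feature_names
    return list(encoder.get_feature_names(input_features))
```

In the question's code this would replace the `ohe.get_feature_names_out()` call with `encoder_feature_names(ohe, ['day'])`. Passing the column name explicitly should keep the generated names consistent, since older versions may otherwise fall back to generic prefixes such as `x0`.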

Alex Bochkarev