How can I get a DF back from a sklearn encoding/scaling pipeline?

Question

I am trying to encode and scale my dataframe using sklearn's pipelines, but it just returns a numpy array instead of a dataframe. Instead of making a hacky solution (which I am best at!), I was hoping there was an easier/standard way to get an encoded/scaled dataframe back.

Here's a sample of the code I'm using to encode/scale:

from sklearn.preprocessing import OneHotEncoder
from sklearn.pipeline import Pipeline
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import StandardScaler
from sklearn.impute import SimpleImputer


# Numeric columns: everything that is not of object dtype
num_attributes = list(train_set.select_dtypes(exclude=['object']))
# Categorical columns: everything of object dtype
cat_attributes = list(train_set.select_dtypes(include=['object']))

cat_pipeline = Pipeline([ 
    ('imputer', SimpleImputer(fill_value='none', strategy='constant')),
    ('one_hot', OneHotEncoder())
    ])

full_pipeline = ColumnTransformer([
    ('num', StandardScaler(), num_attributes),
    ('cat', cat_pipeline, cat_attributes)
])

train_set_prepared = full_pipeline.fit_transform(train_set)

The result comes back as an array rather than a dataframe:

  (0, 0)    nan
  (0, 1)    -0.002676506826924531
  (0, 2)    nan
  (0, 3)    -0.03350622836892517
  (0, 4)    nan
  (0, 5)    -0.03294496247236749
  (0, 6)    0.002534826949104915

Is there a way to easily transform it back into a dataframe that is scaled/encoded?

does [this](https://stackoverflow.com/a/54045636/9243482) help — Ando, Aug 27 '20 at 16:02
@YukiShioriii I tried to wrap my command with the command in the answer and got this error - ValueError: Shape of passed values is (490546, 1), indices imply (490546, 110) - I did this command - df_scaled = pd.DataFrame(full_pipeline.fit_transform(train_set),columns = train_set.columns) — Lostsoul, Aug 27 '20 at 16:07
It didn't work, either, everything is in one column - 0 (0, 0)\tnan\n (0, 1)\t-0.002676506826924531... 1 (0, 0)\tnan\n (0, 1)\t-0.002676506826924531... 2 (0, 0)\tnan\n (0, 1)\t-0.002676506826924531... 3 (0, 0)\tnan\n (0, 1)\t-0.002676506826924531... — Lostsoul, Aug 27 '20 at 16:20
Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/220568/discussion-between-yukishioriii-and-lostsoul). — Ando, Aug 27 '20 at 16:24
