How to apply StandardScaler to a single column?

Question

I need to apply StandardScaler of sklearn to a single column col1 of a DataFrame:

df:

col1  col2  col3
1     0     A
1     10    C
2     1     A
3     20    B

This is how I did it:

from sklearn.preprocessing import StandardScaler

def listOfLists(lst):
    return [[el] for el in lst]

def flatten(t):
    return [item for sublist in t for item in sublist]

scaler = StandardScaler()

df['col1'] = flatten(scaler.fit_transform(listOfLists(df['col1'].to_numpy().tolist())))

However, when I apply inverse_transform, it does not return the initial values of col1. Instead, it returns the normalised values:

scaler.inverse_transform(flatten(scaler.fit_transform(listOfLists(df['col1'].to_numpy().tolist()))))

or:

scaler.inverse_transform(df['col1'])

score 3 · Accepted Answer · answered Mar 02 '22 at 19:14


You could fit a scaler directly on the column (since the scaler is expecting a 2D array, you can select the column as a DataFrame by df[['col1']]):

>>> scaler = StandardScaler()
>>> arr = scaler.fit_transform(df[['col1']]).flatten()
>>> arr
array([-0.90453403, -0.90453403,  0.30151134,  1.50755672])

>>> scaler.inverse_transform(arr)
array([1., 1., 2., 3.])


The first part works well and simplifies my approach, which is nice. But the inverse transform still gives me the normalized values rather than the original values of `col1`. – Fluxy Mar 02 '22 at 19:17
@Fluxy maybe it's the version of scikit-learn? Because your code returns the correct initial data too. – Mar 02 '22 at 19:18
I use the version 0.24.1 – Fluxy Mar 02 '22 at 19:19
@Fluxy could you try without flattening: `scaler.inverse_transform(scaler.fit_transform(df[['col1']]))` – Mar 02 '22 at 19:19

Yes, it works well without flatten. Thanks. – Fluxy Mar 02 '22 at 19:24
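
To summarise the thread, here is a minimal sketch of the round trip that keeps the data 2D whenever it goes through the scaler. It assumes the example df from the question; the column name col1_scaled and the variable names are hypothetical, chosen only for illustration. Behaviour with 1D input differs between scikit-learn versions, so reshaping to a single column before calling inverse_transform is the safer route.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Example frame from the question.
df = pd.DataFrame({
    "col1": [1, 1, 2, 3],
    "col2": [0, 10, 1, 20],
    "col3": ["A", "C", "A", "B"],
})

scaler = StandardScaler()

# df[["col1"]] is a one-column DataFrame (2D), which is what the scaler expects.
scaled = scaler.fit_transform(df[["col1"]])  # shape (4, 1)

# Flatten only for storage in the DataFrame (hypothetical column name).
df["col1_scaled"] = scaled.ravel()

# To invert, reshape the stored values back to a single 2D column first.
as_column = df["col1_scaled"].to_numpy().reshape(-1, 1)
restored = scaler.inverse_transform(as_column).ravel()
```

Note that inverse_transform is called on the already fitted scaler; calling fit_transform again on scaled data would refit the scaler and lose the original mean and standard deviation.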
