2

How can I apply a sklearn scaler to all rows of a pandas DataFrame? The question is related to "pandas dataframe columns scaling with sklearn", but here I want to apply the scaler to all values of a row rather than of a column.

NOTE: I know that for feature scaling it's normal to have features in columns and to scale them column-wise, as in the other question referenced above. However, I'd like to use sklearn scalers to preprocess data for visualization, where scaling row-wise is reasonable in my case.

thinwybk

1 Answer

1

Sklearn works with both pandas DataFrames and NumPy arrays, and NumPy arrays make some basic matrix transformations, such as transposition, more direct than DataFrames do.

You can convert the DataFrame to a NumPy array with `vectors = df.values`. Sklearn scalers work column-wise, so transpose the array, scale the transposed array, and transpose it back:

scaled_rows = scaler.fit_transform(vectors.T).T

and convert the result back to a DataFrame with `scaled_df = pd.DataFrame(data=scaled_rows, columns=df.columns)`. Pass `index=df.index` as well if you want to keep the original row labels.
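
Putting the steps together, a minimal runnable sketch could look like the following. The helper name `scale_rows`, the sample data, and the choice of `MinMaxScaler` are assumptions made for illustration only; any sklearn scaler with a `fit_transform` method should work the same way.

```python
import pandas as pd
from sklearn.preprocessing import MinMaxScaler


def scale_rows(df, scaler):
    """Scale each row of df independently (hypothetical helper).

    sklearn scalers operate column-wise, so the array is transposed
    before scaling and transposed back afterwards.
    """
    scaled = scaler.fit_transform(df.to_numpy().T).T
    # Keep both the column names and the row labels of the input.
    return pd.DataFrame(scaled, columns=df.columns, index=df.index)


# Made-up sample data: two rows with very different value ranges.
df = pd.DataFrame(
    {"a": [1.0, 10.0], "b": [2.0, 20.0], "c": [3.0, 40.0]},
    index=["r1", "r2"],
)

scaled_df = scale_rows(df, MinMaxScaler())
print(scaled_df)
```

With `MinMaxScaler`, each row is mapped to the range 0 to 1 using that row's own minimum and maximum. Note that the scaler is fitted on the rows of this particular DataFrame, so the fitted scaler cannot be reused on a DataFrame with a different number of rows.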

Alexis Benichoux
  • It would be helpful to expand on this explanation a little bit. – kaya3 Nov 13 '19 at 18:07
  • Right. The assignment of the scaled values to the dataframe is missing. – thinwybk Nov 14 '19 at 07:26
  • @Alexis You probably meant `df = pd.DataFrame(columns = dfTest.columns, data = scaler.fit_transform(dfTest.values.T).T)`, right? If you update the answer I'll accept it. – thinwybk Nov 14 '19 at 07:39