I have the following dataframe, df:

   Name  Date        Attr1  Attr2    Attr3
0  Joe   26-12-2007  53.45  53.4500  52.7200
1  Joe   27-12-2007  52.38  52.7399  51.0200
2  Joe   28-12-2007  51.71  51.8500  50.7300

I would like to scale the floating-point values in columns Attr1, Attr2 and Attr3 to between 0 and 1, so that the highest value in each column is scaled to 1. Please note that not all of the columns are to be scaled.

I am using Python 3.6.

The following code scales all the columns, but I need to scale only selected ones. Another problem is that some columns hold dates and strings, so the code below fails when it tries to convert those values to floats.

from sklearn.preprocessing import MinMaxScaler

scaler = MinMaxScaler() 
scaled_values = scaler.fit_transform(df) 
df.loc[:,:] = scaled_values
MarianD
user3848207

1 Answer

The solution is to pass only the columns you want to scale to the scaler:

In:

import pandas as pd

data = pd.DataFrame({'Name':['John','Sara','Martin'],'first':[53.45, 55.51, 51.22],'second':[51.45, 54.51, 57.22],'third':[50.45, 54.51, 58.22]})
data

Out:

    Name    first   second  third
0   John    53.45   51.45   50.45
1   Sara    55.51   54.51   54.51
2   Martin  51.22   57.22   58.22

In:

from sklearn.preprocessing import MinMaxScaler

sclr = MinMaxScaler()
# Scale only the selected numeric columns; the result is a NumPy array, not a DataFrame
new_df = sclr.fit_transform(data[['first', 'second', 'third']])
new_df

Out:

array([[ 0.51981352,  0.        ,  0.        ],
       [ 1.        ,  0.53032929,  0.52252252],
       [ 0.        ,  1.        ,  1.        ]])
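The output above is a plain array, so the scaled values still have to be written back into the dataframe. Below is a minimal sketch of one way to do that with the data from the question. It assumes the Date column is stored as plain strings, and the name cols_to_scale is only an illustrative helper, not something from the original code.

```python
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

# Data from the question; Date is assumed to be stored as plain strings.
df = pd.DataFrame({
    'Name': ['Joe', 'Joe', 'Joe'],
    'Date': ['26-12-2007', '27-12-2007', '28-12-2007'],
    'Attr1': [53.45, 52.38, 51.71],
    'Attr2': [53.45, 52.7399, 51.85],
    'Attr3': [52.72, 51.02, 50.73],
})

# Illustrative helper name: the columns that should be scaled.
cols_to_scale = ['Attr1', 'Attr2', 'Attr3']

scaler = MinMaxScaler()
# fit_transform returns a NumPy array with one column per selected column.
# Assigning it back to the same columns leaves Name and Date untouched.
df[cols_to_scale] = scaler.fit_transform(df[cols_to_scale])

print(df)
```

With the default feature_range of (0, 1), each selected column is scaled independently, so its smallest value maps to 0 and its largest to 1.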
Alex