I wrote my own normalization module, because it seems sklearn
doesn't normalize all the data together (only per column or per row). I have two pieces of code.
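To show what I mean by "per column", here is a quick check of my understanding of MinMaxScaler's behaviour (a minimal sketch, with made-up data):

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Each column is scaled with its own min and max, so both columns
# end up spanning the full (0, 1) range independently of each other.
data = np.array([[-1.0, 2.0], [0.0, 10.0], [1.0, 18.0]])
out = MinMaxScaler().fit_transform(data)
print(out)
# [[0.  0. ]
#  [0.5 0.5]
#  [1.  1. ]]
```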
First, the code with sklearn:
import numpy as np
from sklearn import preprocessing
data = np.array([[-1], [-0.5], [0], [1], [2], [6], [10], [18]])
print(data)
scaler = preprocessing.MinMaxScaler(feature_range=(5, 10))
print(scaler.fit_transform(data))
print(scaler.inverse_transform(scaler.fit_transform(data)))
Result:
[[-1. ]
 [-0.5]
 [ 0. ]
 [ 1. ]
 [ 2. ]
 [ 6. ]
 [10. ]
 [18. ]]
[[ 5.        ]
 [ 5.13157895]
 [ 5.26315789]
 [ 5.52631579]
 [ 5.78947368]
 [ 6.84210526]
 [ 7.89473684]
 [10.        ]]
[[-1. ]
 [-0.5]
 [ 0. ]
 [ 1. ]
 [ 2. ]
 [ 6. ]
 [10. ]
 [18. ]]
And with my module:
data = np.array([[-1, 2], [-0.5, 6], [0, 10], [1, 18]])
print(data)
scaler = scl.Scaler(feature_range=(5, 10))
print(scaler.transform(data))
print(scaler.inverse_transform(scaler.transform(data)))
Result:
[[-1.   2. ]
 [-0.5  6. ]
 [ 0.  10. ]
 [ 1.  18. ]]
[[ 5.          5.78947368]
 [ 5.13157895  6.84210526]
 [ 5.26315789  7.89473684]
 [ 5.52631579 10.        ]]
[[-1.00000000e+00  2.00000000e+00]
 [-5.00000000e-01  6.00000000e+00]
 [ 1.33226763e-15  1.00000000e+01]
 [ 1.00000000e+00  1.80000000e+01]]
I guess 1.33226763e-15 doesn't work for me. I think it occurs because of floating-point arithmetic, although sklearn doesn't have this problem. Please tell me: where is my mistake?

Here is my module:
import numpy as np


class Scaler:
    def __init__(self, feature_range: tuple = (0, 1)):
        self.scaler_min = feature_range[0]
        self.scaler_max = feature_range[1]
        self.data_min = None
        self.data_max = None

    def transform(self, x: np.ndarray):
        self.data_min = x.min(initial=0)
        self.data_max = x.max(initial=0)
        scaled_data = (x - x.min(initial=0)) / (x.max(initial=0) - x.min(initial=0))
        return scaled_data * (self.scaler_max - self.scaler_min) + self.scaler_min

    def inverse_transform(self, x: np.ndarray):
        scaled_data = (x - self.scaler_min) / (self.scaler_max - self.scaler_min)
        return scaled_data * (self.data_max - self.data_min) + self.data_min
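To illustrate the floating-point behaviour I mean, here is the same round trip my module does, reduced to a single value x = 0, with the global data range [-1, 18] and feature range (5, 10) from the example (numbers hard-coded just for this sketch):

```python
# Same arithmetic as Scaler.transform followed by inverse_transform,
# written out for the single value x = 0.
data_min, data_max = -1.0, 18.0   # global min/max of the example data
f_min, f_max = 5.0, 10.0          # feature_range

y = (0.0 - data_min) / (data_max - data_min) * (f_max - f_min) + f_min
x_back = (y - f_min) / (f_max - f_min) * (data_max - data_min) + data_min
print(x_back)  # a tiny nonzero value, not exactly 0.0: the "+ f_min"
               # followed by "- f_min" rounds, so the round trip is inexact
```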