StandardScaler scales your data so that each column has mean μ = 0 and standard deviation σ = 1. According to the documentation:
Centering and scaling happen independently on each feature by
computing the relevant statistics on the samples in the training set.
Because each feature is scaled independently, using its own mean and standard deviation (z = (x - μ) / σ per column), a feature with a larger magnitude cannot overshadow the others. Note, however, that the result still depends on the distribution of the training samples in each feature: standardization only guarantees mean 0 and standard deviation 1, and only a feature that is already normally distributed comes out standard normal after scaling. For further details, see the documentation and this SO thread.
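To make that last point concrete, here is a small sketch on made-up, right-skewed toy data (the exponential sample below is purely illustrative and not part of the question): standardizing such a feature still gives mean 0 and standard deviation 1, but the values stay skewed rather than becoming standard normal.

import numpy as np
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)                       # toy data, assumed for illustration
skewed = rng.exponential(scale=2.0, size=(1000, 1))  # right-skewed feature

scaled = StandardScaler().fit_transform(skewed)
print(scaled.mean(), scaled.std())   # ~0 and ~1, as guaranteed
print((scaled > 2).mean())           # roughly 5%, noticeably more than the ~2.3% a normal would give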
Sample data is scaled in the following code:
from sklearn.preprocessing import StandardScaler
import numpy as np

# Three features on very different scales: roughly 0-100, 0-1, and 1-5.
data = np.array([[100, 0, 2], [66, 1, 5], [50, 0, 4], [33, 1, 1],
                 [0, 0, 3], [25, 0, 2], [75, 1, 4], [50, 1, 3]])

# fit_transform learns each column's mean and standard deviation,
# then applies z = (x - mean) / std to that column.
scaler = StandardScaler()
scaled_data = scaler.fit_transform(data)

print("Scaled Array:")
print(scaled_data)
print('Mean :', scaled_data.mean(axis=0))
print('Standard Deviation :', scaled_data.std(axis=0))
print('Variance :', scaled_data.var(axis=0))
The output is:
Scaled Array:
[[ 1.71992157 -1.         -0.81649658]
 [ 0.55329148  1.          1.63299316]
 [ 0.00428908 -1.          0.81649658]
 [-0.57902597  1.         -1.63299316]
 [-1.71134341 -1.          0.        ]
 [-0.85352716 -1.         -0.81649658]
 [ 0.86210533  1.          0.81649658]
 [ 0.00428908  1.          0.        ]]
Mean : [3.46944695e-18 0.00000000e+00 0.00000000e+00]
Standard Deviation : [1. 1. 1.]
Variance : [1. 1. 1.]
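As a quick sanity check on this output, the statistics the scaler learned are stored on the fitted object, and the transform can be reproduced by hand. The sketch below reuses the data, scaler and scaled_data variables from the snippet above:

print('Column means              :', scaler.mean_)   # per-feature mean of the raw data
print('Column standard deviations:', scaler.scale_)  # per-feature std used for scaling

# Reproduce the transform manually: z = (x - mean) / std, column by column.
manual = (data - scaler.mean_) / scaler.scale_
print(np.allclose(manual, scaled_data))              # True

# inverse_transform undoes the scaling and recovers the original values.
print(np.allclose(scaler.inverse_transform(scaled_data), data))  # True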