Custom Function Use Case
Let's say you have image arrays with a known value range of 0-255 that you want to scale down to 0-1, but you don't want to use MinMaxScaler because it scales by the minimum and maximum it observes, and not all images would have values of 0 and 255 in them. In simpler terms: no one scored 100% on the test, but you still want to scale between 0-100.
from sklearn.preprocessing import FunctionTransformer
import numpy as np
data = np.array([[100, 2], [240, 80], [139, 10], [10, 150]])
def div255(X): return X/255 #encode
def mult255(X): return X*255 #decode
scaler = FunctionTransformer(div255, inverse_func=mult255)
# --- encode ---
mutated = scaler.fit_transform(data)
"""
array([[0.39215686, 0.00784314],
       [0.94117647, 0.31372549],
       [0.54509804, 0.03921569],
       [0.03921569, 0.58823529]])
"""
# --- decode ---
scaler.inverse_transform(mutated)
"""
array([[100.,   2.],
       [240.,  80.],
       [139.,  10.],
       [ 10., 150.]])
"""
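To see why a data-dependent scaler falls short here, the sketch below compares scikit-learn's MinMaxScaler, which rescales by the minimum and maximum it observes in each column, with the fixed division by 255. It reuses the sample data from above; the variable names minmax_scaled and fixed_scaled are only illustrative.

```python
import numpy as np
from sklearn.preprocessing import FunctionTransformer, MinMaxScaler

# Same sample data as above: no column contains both 0 and 255.
data = np.array([[100, 2], [240, 80], [139, 10], [10, 150]])

def div255(X):
    return X / 255

# MinMaxScaler learns each column's min and max from the data it is fit on.
minmax_scaled = MinMaxScaler().fit_transform(data)

# FunctionTransformer applies the known 0-255 range regardless of the data.
fixed_scaled = FunctionTransformer(div255).fit_transform(data)

print(minmax_scaled.max(axis=0))  # each column's observed max is mapped to about 1
print(fixed_scaled.max(axis=0))   # 240 and 150 stay below 1
```

With MinMaxScaler, a pixel value of 240 is treated as if it were the brightest possible value, whereas dividing by 255 keeps it at its true position within the known range.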
Pro Tip
Make sure you define these custom functions in a place where the rest of your program can reference them (e.g. a module of helper functions). This matters most when it comes time to inverse_transform your predictions and/or encode new samples!
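One reason this can matter, assuming the scaler is saved and reloaded with Python's standard pickle module (the text above does not say how it is persisted): pickle stores functions by name, so a named module-level function survives a round trip while a lambda does not. The following is a minimal sketch, and the predictions array is a hypothetical model output.

```python
import pickle

import numpy as np
from sklearn.preprocessing import FunctionTransformer

# Module-level functions can be looked up by name when the scaler is reloaded.
def div255(X):
    return X / 255

def mult255(X):
    return X * 255

data = np.array([[100, 2], [240, 80], [139, 10], [10, 150]])
scaler = FunctionTransformer(div255, inverse_func=mult255).fit(data)

# Round-trip the scaler through pickle, as if saving and reloading it.
restored = pickle.loads(pickle.dumps(scaler))

# Hypothetical predictions in the scaled 0-1 space, decoded back to 0-255.
predictions = np.array([[0.5, 1.0]])
decoded = restored.inverse_transform(predictions)

# A lambda has no importable name, so the standard pickle module rejects it.
lambda_scaler = FunctionTransformer(lambda X: X / 255)
try:
    pickle.dumps(lambda_scaler)
    lambda_pickled = True
except (pickle.PicklingError, AttributeError):
    lambda_pickled = False

print(decoded)
print(lambda_pickled)
```

If the scaler is loaded in a different program, that program also needs to be able to import div255 and mult255 from the same location, which is why a shared helper module is a convenient home for them.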