I need to apply ZCA whitening in PyTorch. I think this can be done with transforms.LinearTransformation, and I found a test in the torchvision repo that gives some insight into how it works (see the final code block or the link below):
https://github.com/pytorch/vision/blob/master/test/test_transforms.py
However, I am struggling to work out how to apply something like this myself.
Currently I have transforms along the lines of:
transform_test = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize(np.array([125.3, 123.0, 113.9]) / 255.0,
                         np.array([63.0, 62.1, 66.7]) / 255.0),
])
The documentation says the way to use LinearTransformation is as follows:
torchvision.transforms.LinearTransformation(transformation_matrix, mean_vector)
whitening transformation: Suppose X is a column vector zero-centered data. Then compute the data covariance matrix [D x D] with torch.mm(X.t(), X), perform SVD on this matrix and pass it as transformation_matrix.
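As a minimal sketch of what that description seems to amount to (this is an interpretation, not the library's own implementation): center the flattened data, form the [D x D] covariance, take its SVD, and build the ZCA matrix from the singular vectors and values. The helper name `zca_whitening_matrix`, the `eps` regularizer, and the toy data are assumptions made for illustration.

```python
import numpy as np

def zca_whitening_matrix(flat_x, eps=1e-5):
    """Return (W, mean) for ZCA whitening of data shaped (N, D).

    eps is an assumed regularizer that keeps very small singular
    values from blowing up; it is not taken from the documentation.
    """
    mean = flat_x.mean(axis=0)
    centered = flat_x - mean                          # zero-center each feature
    sigma = centered.T @ centered / flat_x.shape[0]   # [D x D] covariance
    u, s, _ = np.linalg.svd(sigma)                    # sigma = u @ diag(s) @ u.T
    w = u @ np.diag(1.0 / np.sqrt(s + eps)) @ u.T     # symmetric ZCA matrix
    return w, mean

# Toy correlated data: 5000 samples, 12 features (purely illustrative).
rng = np.random.default_rng(0)
base = rng.normal(size=(5000, 12))
data = base * np.linspace(1.0, 3.0, 12) + 0.5 * base[:, [0]]

w, mean = zca_whitening_matrix(data)
whitened = (data - mean) @ w
cov = whitened.T @ whitened / data.shape[0]   # should be close to the identity
```

If this reading is right, `w` and `mean` are what would be passed (as tensors) for transformation_matrix and mean_vector.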
I can see from the test I linked above (copied below) that they use torch.mm to calculate what they call principal_components:
def test_linear_transformation(self):
    num_samples = 1000
    x = torch.randn(num_samples, 3, 10, 10)
    flat_x = x.view(x.size(0), x.size(1) * x.size(2) * x.size(3))
    # compute principal components
    sigma = torch.mm(flat_x.t(), flat_x) / flat_x.size(0)
    u, s, _ = np.linalg.svd(sigma.numpy())
    zca_epsilon = 1e-10  # avoid division by 0
    d = torch.Tensor(np.diag(1. / np.sqrt(s + zca_epsilon)))
    u = torch.Tensor(u)
    principal_components = torch.mm(torch.mm(u, d), u.t())
    mean_vector = (torch.sum(flat_x, dim=0) / flat_x.size(0))
    # initialize whitening matrix
    whitening = transforms.LinearTransformation(principal_components, mean_vector)
    # estimate covariance and mean using weak law of large number
    num_features = flat_x.size(1)
    cov = 0.0
    mean = 0.0
    for i in x:
        xwhite = whitening(i)
        xwhite = xwhite.view(1, -1).numpy()
        cov += np.dot(xwhite, xwhite.T) / num_features
        mean += np.sum(xwhite) / num_features
    # if rtol for std = 1e-3 then rtol for cov = 2e-3 as std**2 = cov
    assert np.allclose(cov / num_samples, np.identity(1), rtol=2e-3), "cov not close to 1"
    assert np.allclose(mean / num_samples, 0, rtol=1e-3), "mean not close to 0"
    # Checking if LinearTransformation can be printed as string
    whitening.__repr__()
How do I apply something like this? Do I add it where I define my transforms, or do I apply it inside my training loop while iterating over the training data?
Thanks in advance