My dataset is a NumPy array with dimensions (N, W, H, C), where N is the number of images, W and H are the width and height respectively, and C is the number of channels.
I know that there are many tools out there, but I would like to normalize the images using only NumPy.
My plan is to compute the mean and standard deviation across the whole dataset for each of the three channels and then subtract the mean and divide by the standard deviation.
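As a rough sketch of that plan, the per-channel statistics can be taken over the image, width and height axes in a single call. The helper name normalize_per_channel, the eps guard against a zero standard deviation, and the random stand-in data are my own assumptions for illustration:

```python
import numpy as np

def normalize_per_channel(x, eps=1e-8):
    """Standardize an (N, W, H, C) array channel by channel (hypothetical helper)."""
    x = np.asarray(x, dtype=np.float64)
    # Reduce over images, width and height; keepdims leaves shape (1, 1, 1, C)
    # so the statistics broadcast back against x.
    mean = x.mean(axis=(0, 1, 2), keepdims=True)
    std = x.std(axis=(0, 1, 2), keepdims=True)
    # eps is an assumption: it avoids dividing by zero for a constant channel.
    return (x - mean) / (std + eps)

# Random data standing in for a real dataset of 4 images with 3 channels.
rng = np.random.default_rng(0)
images = rng.uniform(0, 255, size=(4, 8, 6, 3))
normalized = normalize_per_channel(images)
```

After this, each channel of normalized should have a mean of roughly 0 and a standard deviation of roughly 1 across the whole dataset.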
Suppose we have two images in the dataset and the first channel of those two images looks like this:
x[:,:,:,0] = array([[[3., 4.],
                     [5., 6.]],
                    [[1., 2.],
                     [3., 4.]]])
Compute the mean:
numpy.mean(x[:,:,:,0])
= 3.5
Compute the std:
numpy.std(x[:,:,:,0])
= 1.5
Normalize the first channel:
x[:,:,:,0] = (x[:,:,:,0] - 3.5) / 1.5
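To double-check the numbers, here is a small sketch that rebuilds the example as a (2, 2, 2, 1) array (a single channel is an assumption made for brevity) and applies the same steps:

```python
import numpy as np

# The two example images with one channel (C=1 is an assumption for brevity).
x = np.array([[[3., 4.],
               [5., 6.]],
              [[1., 2.],
               [3., 4.]]])[..., np.newaxis]   # shape (2, 2, 2, 1)

mean = np.mean(x[:, :, :, 0])   # 3.5
std = np.std(x[:, :, :, 0])     # 1.5 (population std, ddof=0)

# Normalize the first channel in place.
x[:, :, :, 0] = (x[:, :, :, 0] - mean) / std

# The channel should now have mean close to 0 and std close to 1.
print(np.mean(x[:, :, :, 0]), np.std(x[:, :, :, 0]))
```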
Is this correct?
Thanks!