I want to whiten the CIFAR10 dataset using ZCA. The input X_train
is of shape (40000, 32, 32, 3) where 40000 is the number of images, and 32x32x3 is the size of each image. I'm using the code from this answer for this purpose:
X_flat = np.reshape(X_train, (-1, 32*32*3))
# compute the covariance of the image data
cov = np.cov(X_flat, rowvar=True) # cov is (N, N)
# singular value decomposition
U,S,V = np.linalg.svd(cov) # U is (N, N), S is (N,)
# build the ZCA matrix
epsilon = 1e-5
zca_matrix = np.dot(U, np.dot(np.diag(1.0/np.sqrt(S + epsilon)), U.T))
# transform the image data zca_matrix is (N,N)
zca = np.dot(zca_matrix, X_flat) # zca is (N, 3072)
However, at run time I encountered the following warning:
D:\toolkits.win\anaconda3-5.2.0\envs\dlwin36\lib\site- packages\ipykernel_launcher.py:8: RuntimeWarning: invalid value encountered in sqrt
So after I got the SVD output, I tried:
print(np.min(S)) # prints -1.7798217
Which is unexpected because S
can only have positive values. Also, the ZCA whitening result was not correct and it contained nan
values.
I tried reproducing this by re-running this same code a second time and this time I did not encounter any warnings or any negative S
values, but instead I got:
print(np.min(S)) # prints nan
Any idea for why this might have happened?
Update: Restarted the kernel to free up cpu and RAM resources, and tried running this code again. Again got the same warning for feeding in negative values to np.sqrt()
. Not sure if it helps but I've also attached the cpu and ram utilization figures: