I was trying to check my implementation of PCA to see whether I understood it, so I ran PCA with 12 components on the MNIST data set (loaded through the TensorFlow interface, which normalizes it for me). I took the principal components given by sklearn and built the reconstructions as follows:
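import numpy as np
from numpy import linalg as LA
from sklearn.decomposition import PCA

k = 12  # number of principal components
pca = PCA(n_components=k)
pca = pca.fit(X_train)
X_pca = pca.transform(X_train)
# do manual PCA: project onto the components, then map back to feature space
U = pca.components_  # shape (k, n_features)
X_my_reconstruct = np.dot(U.T, np.dot(U, X_train.T)).T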
Then I used the reconstruction interface provided by sklearn to reconstruct the data as follows:
pca = PCA(n_components=k)
pca = pca.fit(X_train)
X_pca = pca.transform(X_train)
X_reconstruct = pca.inverse_transform(X_pca)
Finally, I checked the error as follows (each row is a data point and each column is a feature):
print('X_recon - X_my_reconstruct', (1.0 / X_my_reconstruct.shape[0]) * LA.norm(X_my_reconstruct - X_reconstruct)**2)
#X_recon - X_my_reconstruct 1.47252586279
As you can see, the error is non-zero and actually quite noticeable. Why is that? How is sklearn's reconstruction different from mine?
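One thing that may be worth checking is mean centering. sklearn's PCA subtracts the per-feature mean (stored in `pca.mean_`) before projecting, and `inverse_transform` adds it back, while the manual reconstruction above works on the raw data. The sketch below compares both variants against sklearn. It uses random data as a stand-in for MNIST, and the names `X_uncentered`, `X_centered`, and `mean_sq_error` are made up for this illustration; it also assumes the default `whiten=False`.

```python
import numpy as np
from sklearn.decomposition import PCA

# Synthetic stand-in for MNIST (an assumption): rows are data points, columns are features.
rng = np.random.RandomState(0)
X = rng.rand(200, 30) + 5.0  # deliberately far from zero mean

k = 12
pca = PCA(n_components=k).fit(X)
U = pca.components_  # shape (k, n_features)

# Reconstruction through sklearn's own interface.
X_sklearn = pca.inverse_transform(pca.transform(X))

# Manual reconstruction without centering (same as the code in the question).
X_uncentered = np.dot(np.dot(X, U.T), U)

# Manual reconstruction with centering: subtract the mean, project, map back, add the mean.
X_centered = np.dot(np.dot(X - pca.mean_, U.T), U) + pca.mean_

def mean_sq_error(A, B):
    """Squared Frobenius norm of A - B, divided by the number of rows."""
    return np.linalg.norm(A - B) ** 2 / A.shape[0]

err_uncentered = mean_sq_error(X_uncentered, X_sklearn)
err_centered = mean_sq_error(X_centered, X_sklearn)
print('uncentered vs sklearn:', err_uncentered)
print('centered vs sklearn:', err_centered)
```

If the centered variant agrees with sklearn up to floating-point noise while the uncentered one does not, the missing mean subtraction would account for the gap.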