In a variational autoencoder, the objective function has two terms: a reconstruction term, which makes the output match the input x, and a regularizer, which pushes the approximate posterior q(z) toward the prior p(z) via the KL divergence. What I don't understand is: why can we assume p(z) is a standard Gaussian with mean 0 and variance 1?
Why not, say, a variance less than 1, so that more information is condensed into narrower Gaussians in the hidden layer?
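For concreteness, this is the regularizer term I mean: a minimal NumPy sketch of the closed-form KL divergence between a diagonal-Gaussian q(z|x) and the standard normal prior N(0, I) (the function name and shapes here are just my illustration, not from any particular library):

```python
import numpy as np

def kl_to_standard_normal(mu, log_var):
    """Closed-form KL( N(mu, diag(sigma^2)) || N(0, I) ), summed over latent dims.

    mu, log_var: arrays of shape (latent_dim,), as an encoder would output.
    Formula per dimension: 0.5 * (sigma^2 + mu^2 - 1 - log sigma^2).
    """
    return 0.5 * np.sum(np.exp(log_var) + mu**2 - 1.0 - log_var)

# When q(z|x) exactly matches the prior, the penalty vanishes:
print(kl_to_standard_normal(np.zeros(2), np.zeros(2)))  # 0.0

# Any deviation from mean 0 / variance 1 is penalized:
print(kl_to_standard_normal(np.ones(2), np.zeros(2)) > 0)  # True
```

My question is about the second argument of this KL, i.e. why the target is fixed at mean 0 and variance 1 rather than some narrower Gaussian.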
Thank you