I have been looking at autoencoders and have been wondering whether to use tied weights or not. I intend to stack them as a pretraining step and then use their hidden representations to feed a NN.
Using untied weights it would look like:
$$f(x)=\sigma_2(b_2+W_2\,\sigma_1(b_1+W_1x))$$
Using tied weights it would look like:
$$f(x)=\sigma_2(b_2+W_1^T\,\sigma_1(b_1+W_1x))$$
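To make the two variants concrete, here is a minimal sketch of what I mean (PyTorch, sigmoid chosen for both $\sigma_1$ and $\sigma_2$ just for illustration; the class names are mine, not from any library):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class UntiedAutoencoder(nn.Module):
    """Untied: encoder weight W1 and decoder weight W2 are separate parameters."""
    def __init__(self, n_in, n_hidden):
        super().__init__()
        self.enc = nn.Linear(n_in, n_hidden)   # W1, b1
        self.dec = nn.Linear(n_hidden, n_in)   # W2, b2

    def forward(self, x):
        h = torch.sigmoid(self.enc(x))         # sigma1(b1 + W1 x)
        return torch.sigmoid(self.dec(h))      # sigma2(b2 + W2 h)

class TiedAutoencoder(nn.Module):
    """Tied: the decoder reuses the transpose of the encoder weight W1."""
    def __init__(self, n_in, n_hidden):
        super().__init__()
        self.enc = nn.Linear(n_in, n_hidden)             # W1, b1
        self.b2 = nn.Parameter(torch.zeros(n_in))        # only the decoder bias is extra

    def forward(self, x):
        h = torch.sigmoid(self.enc(x))                                       # sigma1(b1 + W1 x)
        return torch.sigmoid(F.linear(h, self.enc.weight.t(), self.b2))      # sigma2(b2 + W1^T h)
```

So the tied version has roughly half the weight parameters, and any gradient flowing through the decoder also updates the encoder's W1.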
From a very simplistic view, could one say that tying the weights ensures that the encoder part is generating the best representation possible given the architecture, whereas with independent weights the decoder could effectively take a non-optimal representation and still decode it?
I ask because if the decoder is where the "magic" occurs, and I intend to use only the encoder to drive my NN, wouldn't that be problematic?