I am referring to this Variational Autoencoder code on GitHub for the CelebA dataset. There is an encoder and a decoder, as usual. The encoder outputs x, which is then used to get the mean and the log variance of the posterior distribution. The reparameterization trick then gives us the encoded
z = sigma * u + mean
where u is drawn from a normal distribution with mean = 0 and variance = 1. I can follow all of this in the code. However, in the 'ScaleShift' class, which is a custom Keras layer, I don't understand why add_loss(logdet) is being called. I checked the Keras documentation on writing your own layers, and there doesn't seem to be any requirement to define a loss for a layer. I'm confused about what this line is actually doing.
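The scale-and-shift step itself is clear to me; as a sanity check of my own (not from the repo), I verified in plain numpy that multiplying a standard normal u by exp(log_scale) and adding shift gives samples with mean shift and standard deviation exp(log_scale):

# quick numpy check (mine, not from the repo): exp(log_scale) * u + shift
# with u ~ N(0, 1) gives samples with mean = shift and std = exp(log_scale)
import numpy as np

rng = np.random.RandomState(0)
u = rng.randn(100000)              # u ~ N(0, 1)
shift, log_scale = 1.5, -0.7       # arbitrary test values
z = np.exp(log_scale) * u + shift
print(z.mean(), z.std())           # ~1.5 and ~exp(-0.7) ≈ 0.50

Here is the relevant part of the code, with the imports added at the top for completeness: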
from keras.layers import Layer, Dense, Lambda, Subtract
from keras.models import Model
from keras.optimizers import Adam
import keras.backend as K
import numpy as np

# custom Keras layer
class ScaleShift(Layer):
    def __init__(self, **kwargs):
        super(ScaleShift, self).__init__(**kwargs)

    def call(self, inputs):
        z, shift, log_scale = inputs
        # reparameterization trick
        z = K.exp(log_scale) * z + shift
        # what are the next two lines doing?
        logdet = -K.sum(K.mean(log_scale, 0))
        self.add_loss(logdet)
        return z
# get the mean parameter (z_shift) and the log-variance parameter (z_log_scale) from the encoder output x
z_shift = Dense(z_dim)(x)
z_log_scale = Dense(z_dim)(x)
# Keras Lambda layer: draws a random normal variable u with the same shape as z_shift
u = Lambda(lambda z: K.random_normal(shape=K.shape(z)))(z_shift)
z = ScaleShift()([u, z_shift, z_log_scale])
x_recon = decoder(z)
x_out = Subtract()([x_in, x_recon])
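# reconstruction loss (looks like a unit-variance Gaussian negative log-likelihood, batch-averaged and summed over pixels)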
recon_loss = 0.5 * K.sum(K.mean(x_out**2, 0)) + 0.5 * np.log(2*np.pi) * np.prod(K.int_shape(x_out)[1:])
# KL divergence loss
z_loss = 0.5 * K.sum(K.mean(z**2, 0)) - 0.5 * K.sum(K.mean(u**2, 0))
vae_loss = recon_loss + z_loss
vae = Model(x_in, x_out)
# VAE loss to be optimised
vae.add_loss(vae_loss)
vae.compile(optimizer=Adam(1e-4))
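For what it's worth, my current understanding from the docs is that anything passed to self.add_loss inside call() simply gets collected by Keras and summed into the model's total training objective, as in this toy sketch (the layer and all names here are mine, not from the repo):

from keras.layers import Layer, Input, Dense
from keras.models import Model
import keras.backend as K

class AddL2Penalty(Layer):
    # toy layer: passes its input through unchanged but registers an extra scalar loss
    def call(self, inputs):
        self.add_loss(1e-3 * K.sum(K.square(inputs)))  # gets summed into the total loss
        return inputs

inp = Input(shape=(8,))
h = Dense(4)(inp)
h = AddL2Penalty()(h)        # contributes 1e-3 * ||h||^2 to whatever loss the model uses
out = Dense(8)(h)
toy = Model(inp, out)
toy.compile(optimizer='adam', loss='mse')  # training minimises mse + the add_loss term

If that is all add_loss does, I still don't see why -K.sum(K.mean(log_scale, 0)) should be part of the loss in the VAE above.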