SSIM as a custom loss function in an autoencoder (Keras and/or TensorFlow)

Question

I am currently programming an autoencoder for image compression. From a previous post I now have final confirmation that I cannot use pure Python functions as loss functions in either Keras or TensorFlow. (And I am slowly beginning to understand why ;-) )

I would like to run some experiments using SSIM as a loss function and as a metric. It seems I might be lucky: there is already an implementation of it in TensorFlow, see https://www.tensorflow.org/api_docs/python/tf/image/ssim

tf.image.ssim( img1, img2, max_val )

In addition, bsautermeister kindly provided an implementation here on stackoverflow: SSIM / MS-SSIM for TensorFlow.

My question now is: how would I use it as a loss function with the MNIST dataset? The function does not seem to accept a tensor, only two images. And will the gradient be computed automatically? From what I understand, it should be if the function is implemented in TensorFlow or the Keras backend.

I would be very grateful for a minimal working example (MWE) showing how to use any of the previously mentioned SSIM implementations as a loss function in either Keras or TensorFlow.

Maybe we can use my MWE for an autoencoder provided with my previous question: keras custom loss pure python (without keras backend)

If it is not possible to glue my Keras autoencoder together with the SSIM implementations, would it be possible with an autoencoder implemented directly in TensorFlow? I have that too, and can provide it.

I am working with Python 3.5, Keras (with the TensorFlow backend) and, if necessary, TensorFlow directly. Currently I am using the MNIST dataset (the one with the handwritten digits).

Thanks for any help!

(P.S.: Several people seem to be working on similar things. An answer to this post may also be useful for Keras - MS-SSIM as loss function)

Patwie · Accepted Answer · 2018-07-04T11:49:45.513

I cannot help with Keras, but in plain TensorFlow you simply replace the L2 (or whatever other) cost with the SSIM result, for example:

import tensorflow as tf
import numpy as np


def fake_img_batch(*shape):
    i = np.random.randn(*shape).astype(np.float32)
    i[i < 0] = -i[i < 0]
    return tf.convert_to_tensor(np.clip(i * 255, 0, 255))


fake_img_a = tf.get_variable('a', initializer=fake_img_batch(2, 224, 224, 3))
fake_img_b = tf.get_variable('b', initializer=fake_img_batch(2, 224, 224, 3))

fake_img_a = tf.nn.sigmoid(fake_img_a)
fake_img_b = tf.nn.sigmoid(fake_img_b)

# costs = tf.losses.mean_squared_error(fake_img_a, fake_img_b, reduction=tf.losses.Reduction.MEAN)
costs = tf.image.ssim(fake_img_a, fake_img_b, 1.)  # one SSIM value per image in the batch
mean_ssim = tf.reduce_mean(costs)  # average over the batch -> scalar
# SSIM is a similarity score (1.0 = identical images), so minimize 1 - SSIM
costs = 1. - mean_ssim

train = tf.train.AdamOptimizer(0.01).minimize(costs)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run(mean_ssim))
    for k in range(500):
        _, l = sess.run([train, mean_ssim])
        if k % 100 == 0:
            print('mean SSIM', l)

Checking whether an operation has gradients implemented is straightforward:

import tensorflow as tf
import numpy as np


def fake_img_batch(*shape):
    i = np.random.randn(*shape).astype(np.float32)
    i[i < 0] = -i[i < 0]
    return tf.convert_to_tensor(np.clip(i * 255, 0, 255))

x1 = tf.convert_to_tensor(fake_img_batch(2, 28, 28, 3))
x2 = tf.convert_to_tensor(fake_img_batch(2, 28, 28, 3))


y1 = tf.argmax(x1)  # not differentiable -> no gradients
y2 = tf.image.ssim(x1, x2, 255) # has gradients

with tf.Session() as sess:
    print(tf.gradients(y1, [x1]))  # will print [None] --> no gradient
    print(tf.gradients(y2, [x1, x2]))  # will print [<tf.Tensor 'gradients ...>, ...] --> has gradient
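
For the Keras half of the question, the following is a minimal sketch rather than a tested recipe. It assumes TensorFlow 2.x with the bundled `tf.keras` API (the answer above uses the TensorFlow 1.x graph API), images scaled to [0, 1], and a deliberately tiny convolutional autoencoder. The name `ssim_loss` and the random stand-in data are hypothetical and only serve the illustration. Because SSIM is a similarity score, the loss is written as 1 - SSIM.

```python
import numpy as np
import tensorflow as tf


def ssim_loss(y_true, y_pred):
    # tf.image.ssim returns one SSIM value per image (1.0 = identical).
    # Returning 1 - SSIM per image lets Keras average over the batch.
    return 1.0 - tf.image.ssim(y_true, y_pred, max_val=1.0)


# Tiny illustrative autoencoder for 28x28 grayscale images (assumed layout).
inputs = tf.keras.Input(shape=(28, 28, 1))
hidden = tf.keras.layers.Conv2D(8, 3, padding="same", activation="relu")(inputs)
outputs = tf.keras.layers.Conv2D(1, 3, padding="same", activation="sigmoid")(hidden)
model = tf.keras.Model(inputs, outputs)
model.compile(optimizer="adam", loss=ssim_loss)

# Random stand-in for MNIST images already reshaped to [N, 28, 28, 1].
images = np.random.rand(16, 28, 28, 1).astype("float32")
history = model.fit(images, images, epochs=1, batch_size=8, verbose=0)
print("SSIM loss after one epoch:", history.history["loss"][0])
```

The same function could also be passed in `metrics=[...]` when compiling, which would report 1 - SSIM during training; whether that is the most useful metric form is a design choice left open here.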

Firstly, many thanks. This is very helpful already. You clarified the gradient question for me. — Boris Reif, Jul 04 '18 at 14:08
Unfortunately, there is still something I don't quite understand. Using mnist=input_data.read_data_sets("MNIST_data", one_hot=True) and print(mnist.train.num_examples,mnist.test.num_examples,mnist.validation.num_examples) I get (55000,784) (55000,10) For the fake images you are using a proper tensor with (2, 224, 224, 3) (2, 224, 224, 3). But tf.image.ssim is only designed for images? What am I missing here? I cannot get the code to work with mnist. Can you clarify on how dimensions go together? — Boris Reif, Jul 04 '18 at 14:21
You can *only* apply SSIM to two images, so you have to reshape the 784-input to [28, 28, 1]. And please do [not forget](https://stackoverflow.com/help/someone-answers) — Patwie, Jul 04 '18 at 14:23
Thanks! I'll try again. I'll post the MWE as soon as I get it to work. — Boris Reif, Jul 04 '18 at 14:47
What is the function of the `tf.reduce_mean()` line here? Is the original `costs` giving a tensor of ssims across a batch, and the next line averaging them to a single batch mean ssim value? — Apollys supports Monica, Jul 26 '18 at 17:59
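
The two points raised in these comments (reshaping flattened 784-value MNIST rows into images, and what `tf.reduce_mean` does to the SSIM output) can be illustrated with a short sketch. It assumes TensorFlow 2.x in eager mode and uses random arrays as a stand-in for real MNIST rows; the variable names are hypothetical.

```python
import numpy as np
import tensorflow as tf

rng = np.random.default_rng(0)

# Stand-in for a batch of 4 flattened MNIST rows with pixel values in [0, 1].
flat = rng.random((4, 784)).astype("float32")
noise = 0.1 * rng.standard_normal((4, 784)).astype("float32")
noisy = np.clip(flat + noise, 0.0, 1.0)

# tf.image.ssim expects [..., height, width, channels], so reshape 784 -> 28x28x1.
imgs_a = tf.reshape(flat, [-1, 28, 28, 1])
imgs_b = tf.reshape(noisy, [-1, 28, 28, 1])

per_image = tf.image.ssim(imgs_a, imgs_b, max_val=1.0)  # one value per image
batch_mean = tf.reduce_mean(per_image)                  # single scalar for the batch

print(per_image.shape)  # (4,)
print(batch_mean.shape)  # ()
```

So the first call yields a vector of per-image SSIM values, and `tf.reduce_mean` collapses it into the single scalar an optimizer needs.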
