I have written a rather complex loss function for a Keras model, and it keeps returning nan while training, so I need to print the intermediate tensors during training. I understand that you cannot call K.eval in a loss function because the tensors are not initialized at that point. However, I have tried both K.print_tensor() and tf.Print(), and neither works.

Essentially, I want to do something like this:

import tensorflow as tf
import keras.backend as K

def mean_squared_error(y_true, y_pred):
    print("mean_squared_error")  # runs once, when the graph is built
    loss = K.mean(K.square(y_pred - y_true), axis=-1)
    loss = tf.Print(loss, [loss])  # should print the loss values at every training step
    return loss

model.compile(optimizer=self.optimizer, loss=mean_squared_error)

In practice, I would replace mean_squared_error with my custom loss. "mean_squared_error" gets printed, but not the values I try to print with tf.Print (or with K.print_tensor). I also tried the exact code from How do I print inside the loss function during training in Keras?, and I still don't see anything printed in the console.

In addition, I have written a separate file as a small test:

import tensorflow as tf
import keras.backend as K

input1 = K.constant(1)
input2 = K.constant(2)
input3 = K.constant(3)

node1 = tf.add(input1, input2)
print_output = K.print_tensor(node1)  # expected to print node1's value when evaluated
output = tf.multiply(print_output, input3)

Nothing gets printed either.

Am I using tf.Print and K.print_tensor incorrectly, or are the results printed somewhere else? I have checked that my console shows stderr by running print("test", file=sys.stderr), and the output test appeared correctly.

For clarification, I know that you can use K.eval to make the test code print out values of the tensor, but since I cannot use K.eval in my loss function, I need to make tf.Print or K.print_tensor work.

Leo Appleseed
  • Note to future self: K.print_tensor not printing in Jupyter notebook. Fine when the script is run from console. – winterlight Aug 30 '19 at 07:41

4 Answers


The issue here is that the training step often does not actually depend on the value of the loss tensor! You can usually compute the gradient of a loss without ever computing the loss value itself, which means TensorFlow's runtime is free to prune the evaluation of the loss (and of any print op attached to it) from the graph.

You can wrap your loss function in a tf.contrib.eager.defun decorator, which has the side effect of guaranteeing that all stateful ops in your function run even if they are not needed by the backward pass.
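A minimal sketch of what that wrapping might look like (assuming TensorFlow 1.x, where tf.contrib.eager.defun is available, and reusing the MSE loss from the question; the model and optimizer are whatever you already have):

import tensorflow as tf
import keras.backend as K

@tf.contrib.eager.defun  # forces stateful ops, such as the print, to run
def mean_squared_error(y_true, y_pred):
    loss = K.mean(K.square(y_pred - y_true), axis=-1)
    loss = tf.Print(loss, [loss], message="loss: ")
    return loss

model.compile(optimizer="adam", loss=mean_squared_error)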

Alexandre Passos

You will have to use tf.InteractiveSession if you want to run ops and print results without passing a session -- see details here

So your test code will print the node1 value if changed as follows:

import tensorflow as tf
import keras.backend as K

input1 = K.constant(1)
input2 = K.constant(2)
input3 = K.constant(3)

node1 = tf.add(input1, input2)
print_output = K.print_tensor(node1)
output = tf.multiply(print_output, input3)
sess = tf.InteractiveSession()  # becomes the default session, so .eval() works without passing it
print("node1: ", node1.eval())
sess.close()
bigdata2
  • Thanks for the reply! Well the snippet you are referring to is only meant as a test for tf.Print. I really want it to print things in the loss function, and I am not able to use .eval() in loss functions as the tensors are not initialized when the loss function is being called. – Leo Appleseed Aug 26 '18 at 04:31
  • node.eval() with tf.InteractiveSession() or print(sess.run(node)) when a session is passed will print the result of any operation including the loss function that you mentioned in your code. It seems that you have some other problems in your code. If you want help, first you should post the entire code, and second, do not post test-code and questions regarding test-code and upon receiving an answer say "I really want [something else...]". – bigdata2 Aug 26 '18 at 06:51
  • I have tried this inside a loss function and it does not work. I would get a "tensor not initialized error." I believe that loss functions are passed into the model during model.compile, with which tensorflow creates the computation graph. The point of using tf.Print is to add another print node so that when actual values flow through the tensors they will be printed to console at every epoch. Thank you for your answer though. – Leo Appleseed Aug 26 '18 at 08:50
  • You should learn how to properly pass training data into a model using tensorflow first. Go over this example https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/debug/examples/debug_mnist.py for details. The same example also shows how to print results of operations on graph nodes and how to use tf_debugger to print value of any graph node. – bigdata2 Aug 26 '18 at 09:28

Your code builds a graph:

import tensorflow as tf
import keras.backend as K

input1 = K.constant(1)
input2 = K.constant(2)
input3 = K.constant(3)

node1 = tf.add(input1, input2)
print_output = K.print_tensor(node1)
output = tf.multiply(print_output, input3)

In order to run the graph, you need to define a Session environment in which Operation objects are executed, and Tensor objects are evaluated:

sess = tf.Session()

To evaluate the tensor output:

sess.run(output)

Finally, release the resources:

sess.close()

In short, your code only defines the graph; without a Session and an explicit evaluation, nothing is actually executed, so nothing gets printed.
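Putting those pieces together with the original test graph, a minimal sketch (assuming TensorFlow 1.x graph mode) would be:

import tensorflow as tf
import keras.backend as K

input1 = K.constant(1)
input2 = K.constant(2)
input3 = K.constant(3)

node1 = tf.add(input1, input2)
print_output = K.print_tensor(node1)  # the print op only fires when something depending on it is run
output = tf.multiply(print_output, input3)

sess = tf.Session()
print(sess.run(output))  # evaluating output also executes the print op, so node1's value is printed
sess.close()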


In TensorFlow 2, similarly to the defun solution suggested in the answer above, you can decorate your loss function with @tf.function.
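A minimal sketch of what that could look like (assuming TensorFlow 2.x with the built-in Keras, an already-defined model, and tf.print in place of the removed tf.Print):

import tensorflow as tf

@tf.function
def my_loss(y_true, y_pred):
    loss = tf.reduce_mean(tf.square(y_pred - y_true), axis=-1)
    tf.print("loss:", loss)  # tf.print is a stateful op and runs inside tf.function
    return loss

model.compile(optimizer="adam", loss=my_loss)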

nbro