
I wrote my own custom Huber loss function the way this tutorial (https://goodboychan.github.io/python/coursera/tensorflow/deeplearning.ai/2022/02/08/01-Tensorflow2-Custom-Loss-Function.html) suggests:

import tensorflow as tf

def my_huber_loss(y_true, y_pred):
    threshold = 1.
    error = y_true - y_pred
    is_small_error = tf.abs(error) <= threshold
    small_error_loss = tf.square(error) / 2  # quadratic branch
    big_error_loss = threshold * (tf.abs(error) - threshold / 2)  # linear branch
    return tf.where(is_small_error, small_error_loss, big_error_loss)
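
For reference, this implements the standard Huber loss with threshold $\delta$ (here $\delta = 1$):

$$
L_\delta(e) = \begin{cases} \frac{1}{2}e^2 & \text{if } |e| \le \delta \\ \delta\left(|e| - \frac{\delta}{2}\right) & \text{otherwise} \end{cases}
$$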

I passed it to model.compile(optimizer='adam', loss=my_huber_loss, metrics=['mae']), and training works fine.

Now I would like to know how many times this Huber loss is called during the training phase, so I did what "is there a way to track the number of times a function is called?" suggests:

def my_huber_loss(y_true, y_pred):
    threshold = 1.
    error = y_true - y_pred
    is_small_error = tf.abs(error) <= threshold
    small_error_loss = tf.square(error) / 2
    big_error_loss = threshold * (tf.abs(error) - threshold / 2)
    my_huber_loss.counter += 1  # THIS IS THE NEW LINE
    return tf.where(is_small_error, small_error_loss, big_error_loss)

my_huber_loss.counter = 0  # INITIALIZE

However, after the whole training run, the counter comes out far too small:

results = model.fit(X_train, Y_train, validation_split=0.1, batch_size=1, epochs=numEpochs, callbacks=[earlystopper])
print(my_huber_loss.counter)

This prints 3.

I know this number cannot be correct, since the loss function should be called many more times. Moreover, I added a tf.print("--- Called Loss ---") line inside my_huber_loss(), and I can see it being called over and over, e.g.:

Epoch 1/2
--- Called Loss ---
   1/1440 [..............................] - ETA: 56:15 - loss: 0.0411 - mae: 0.2357--- Called Loss ---
--- Called Loss ---
   3/1440 [..............................] - ETA: 47s - loss: 0.0398 - mae: 0.2291  --- Called Loss ---
--- Called Loss ---
   5/1440 [..............................] - ETA: 45s - loss: 0.0338 - mae: 0.2096--- Called Loss ---
--- Called Loss ---
   7/1440 [..............................] - ETA: 46s - loss: 0.0338 - mae: 0.2110--- Called Loss ---
--- Called Loss ---
   9/1440 [..............................] - ETA: 44s - loss: 0.0306 - mae: 0.1997--- Called Loss ---
--- Called Loss ---
  11/1440 [..............................] - ETA: 43s - loss: 0.0279 - mae: 0.1893--- Called Loss ---
--- Called Loss ---
  13/1440 [..............................] - ETA: 41s - loss: 0.0265 - mae: 0.1836--- Called Loss ---
--- Called Loss ---
  15/1440 [..............................] - ETA: 41s - loss: 0.0261 - mae: 0.1824--- Called Loss ---
--- Called Loss ---
--- Called Loss ---
  18/1440 [..............................] - ETA: 39s - loss: 0.0250 - mae: 0.1783--- Called Loss ---
--- Called Loss ---
--- Called Loss ---
  21/1440 [..............................] - ETA: 38s - loss: 0.0243 - mae: 0.1764--- Called Loss ---
...

What is going wrong? How can I count the number of times I call a loss function?

Theo Deep

1 Answer


I would personally use subclassing to preserve the state (see the sketch at the end of this answer). Regardless, the issue you are facing comes from relying on Python side effects. Keras compiles the training step into a TensorFlow graph via tf.function, and a Python statement like my_huber_loss.counter += 1 only runs while the function is being traced, not every time the compiled graph executes; tracing typically happens only a few times (e.g. for training and validation), which is why you end up with 3. The TensorFlow documentation explains why one should not rely on Python side effects like object mutation or list appends; instead, it recommends TensorFlow operations such as tf.print, assign_add, etc.
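
To see this in isolation, here is a minimal sketch (not from the original answer, independent of Keras) contrasting a plain Python counter with a tf.Variable inside a tf.function:

import tensorflow as tf

tf_calls = tf.Variable(0, trainable=False)  # graph-compatible counter
py_calls = 0                                # plain Python counter

@tf.function
def f(x):
    global py_calls
    py_calls += 1              # Python side effect: runs only while tracing
    tf_calls.assign_add(1)     # TF op: baked into the graph, runs on every call
    return x + 1

for _ in range(5):
    f(tf.constant(1.0))

print(py_calls)          # 1 -> the function body was traced once
print(tf_calls.numpy())  # 5 -> the graph was executed five times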

Here is one way to fix the issue using graph-compatible TensorFlow code (note that tf.print, unlike Python's print, is a graph operation, so it runs on every execution rather than only during tracing):

def my_huber_loss(y_true, y_pred):
    threshold = 1.
    error = y_true - y_pred
    is_small_error = tf.abs(error) <= threshold
    small_error_loss = tf.square(error) / 2
    big_error_loss = threshold * (tf.abs(error) - threshold / 2)
    my_huber_loss.counter.assign_add(1)  # graph-compatible TF op (counter is a tf.Variable, initialized below)
    tf.print("--- Called Loss ---")
    return tf.where(is_small_error, small_error_loss, big_error_loss)

A simple model:

model = tf.keras.Sequential([
    tf.keras.layers.Dense(units=1, input_shape=[1,])
])
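
The runs below use xs and ys, which are not shown in the original answer; here is an assumed stand-in dataset consistent with its description of 6 examples:

# Assumed data: the answer only states that the dataset has 6 examples
xs = tf.constant([-1.0, 0.0, 1.0, 2.0, 3.0, 4.0])
ys = tf.constant([-3.0, -1.0, 1.0, 3.0, 5.0, 7.0])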

batch_size=1: the loss function is called once per batch, i.e. 6 times per epoch for a dataset of 6 examples.

my_huber_loss.counter = tf.Variable(0, trainable=False)  # INITIALIZE
model.compile(optimizer='sgd', loss=my_huber_loss)

results = model.fit(xs, ys, batch_size=1, epochs=2, verbose=0)
print(my_huber_loss.counter)

Output:

--- Called Loss ---
--- Called Loss ---
--- Called Loss ---
--- Called Loss ---
--- Called Loss ---
--- Called Loss ---
--- Called Loss ---
--- Called Loss ---
--- Called Loss ---
--- Called Loss ---
--- Called Loss ---
--- Called Loss ---
<tf.Variable 'Variable:0' shape=() dtype=int32, numpy=12>

batch_size=10: the 6 examples fit into a single batch (in general there are ceil(num_examples / batch_size) batches per epoch), so the loss function is called once per epoch.

my_huber_loss.counter = tf.Variable(0, trainable=False)  # INITIALIZE
model.compile(optimizer='sgd', loss=my_huber_loss)

results = model.fit(xs, ys, batch_size=10, epochs=2, verbose=0)
print(my_huber_loss.counter)

Output:

--- Called Loss ---
--- Called Loss ---
<tf.Variable 'Variable:0' shape=() dtype=int32, numpy=2>
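
Finally, as mentioned at the top, a subclassed loss keeps the counter as part of its own state. A minimal sketch (the class name CountingHuberLoss is mine, not from the original answer), reusing the model, xs, and ys from above:

class CountingHuberLoss(tf.keras.losses.Loss):
    def __init__(self, threshold=1.0, **kwargs):
        super().__init__(**kwargs)
        self.threshold = threshold
        self.counter = tf.Variable(0, trainable=False)  # state lives on the loss object

    def call(self, y_true, y_pred):
        self.counter.assign_add(1)  # graph-compatible increment
        error = y_true - y_pred
        is_small_error = tf.abs(error) <= self.threshold
        small_error_loss = tf.square(error) / 2
        big_error_loss = self.threshold * (tf.abs(error) - self.threshold / 2)
        return tf.where(is_small_error, small_error_loss, big_error_loss)

loss_fn = CountingHuberLoss()
model.compile(optimizer='sgd', loss=loss_fn)
model.fit(xs, ys, batch_size=1, epochs=2, verbose=0)
print(loss_fn.counter.numpy())  # 12 with batch_size=1 and 2 epochs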
learner
  • Hey, this totally solved my problem, thanks a lot!! Now I have to read up on what kind of object `tf.Variable` is in order to better understand this solution. I also read the link you offered, which says, as you noted: "Side effects, like printing, appending to lists, and mutating globals, can behave unexpectedly inside a Function." One of the examples mentioned is a counter itself. But it does not explain why this "weird side-effect behaviour" happens. Do you have a quick explanation of the reason? Thanks a lot again!! – Theo Deep Jan 11 '23 at 14:41
  • First, please note that the statement "Side effects, like printing, appending to lists, and mutating globals, can behave unexpectedly inside a Function" from https://www.tensorflow.org/guide/function#executing_python_side_effects does not suggest random behavior. Here, "behave unexpectedly" simply means that one may expect a specific output based on their Python experience but get a different one, because TensorFlow graph programming is not the same as the Python language. There are two phases: tracing and execution. – learner Jan 11 '23 at 15:12
  • It becomes much clearer after reading about the **tracing** concept in this subsection: https://www.tensorflow.org/guide/function#tracing Specifically regarding your question: changing Python global and free variables counts as a Python side effect, so it only happens during tracing. https://www.tensorflow.org/guide/function#changing_python_global_and_free_variables – learner Jan 11 '23 at 15:13