
I have a problem with my custom_loss function.

My y_true and y_pred tensors have shape (200, 50, 1).
But I need to ignore the first element of each sequence in my loss function, so I build local y_true_new and y_pred_new tensors without the first element of each vector; their shape is (200, 49, 1).
After that I take their absolute difference and threshold it with the condition:
differ < 0.5 => 0
differ >= 0.5 => 1
Then I cast this boolean tensor to float32 and sum the values. That sum is what I want to minimize.
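
For example, with toy NumPy values (a 2x4 batch instead of 200x50, made-up numbers) the intended computation looks like this:

import numpy as np

# Toy batch: 2 sequences of length 4
y_true = np.array([[0.0, 1.0, 0.0, 1.0],
                   [1.0, 0.0, 1.0, 0.0]])
y_pred = np.array([[0.1, 0.9, 0.7, 0.2],
                   [0.8, 0.4, 0.3, 0.6]])

# Drop the first element of each sequence, take the absolute difference
diff = np.abs(y_true[:, 1:] - y_pred[:, 1:])

# Threshold: 1 where the difference is >= 0.5, else 0
stepped = (diff >= 0.5).astype(np.float32)

loss = stepped.sum()  # counts the "hard" errors -> 4.0 here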
But at runtime I get this error after K.cast():

Traceback (most recent call last):
  File "/home/bocharick/HDD/UbuntuFiles/PycharmProjects/punctuation/bin/keras_punctuator_train.py", line 195, in <module>
    model.fit(X, Y, epochs=1, batch_size=BATCH_SIZE, verbose=1, callbacks=[tensorboard, save_model_weights, save_pretrain_model])
  File "/usr/local/lib/python3.5/dist-packages/keras/models.py", line 867, in fit
    initial_epoch=initial_epoch)
  File "/usr/local/lib/python3.5/dist-packages/keras/engine/training.py", line 1575, in fit
    self._make_train_function()
  File "/usr/local/lib/python3.5/dist-packages/keras/engine/training.py", line 960, in _make_train_function
    loss=self.total_loss)
  File "/usr/local/lib/python3.5/dist-packages/keras/legacy/interfaces.py", line 87, in wrapper
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.5/dist-packages/keras/optimizers.py", line 432, in get_updates
    m_t = (self.beta_1 * m) + (1. - self.beta_1) * g
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/math_ops.py", line 856, in binary_op_wrapper
    y = ops.convert_to_tensor(y, dtype=x.dtype.base_dtype, name="y")
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/ops.py", line 611, in convert_to_tensor
    as_ref=False)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/ops.py", line 676, in internal_convert_to_tensor
    ret = conversion_func(value, dtype=dtype, name=name, as_ref=as_ref)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/constant_op.py", line 121, in _constant_tensor_conversion_function
    return constant(v, dtype=dtype, name=name)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/constant_op.py", line 102, in constant
    tensor_util.make_tensor_proto(value, dtype=dtype, shape=shape, verify_shape=verify_shape))
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/tensor_util.py", line 364, in make_tensor_proto
    raise ValueError("None values not supported.")
ValueError: None values not supported.

My custom_loss function is:

from keras import backend as K

BATCH_SIZE = 200
MAX_SEQUENCE_LEN = 50

def custom_loss(y_true, y_pred):
    # Drop the first timestep of y_true: flatten to 2-D, transpose so
    # timesteps are rows, slice off row 0, transpose back, restore shape
    y_true_new = K.reshape(y_true, (BATCH_SIZE, MAX_SEQUENCE_LEN))
    y_true_new = K.transpose(y_true_new)
    y_true_new = y_true_new[1:]
    y_true_new = K.transpose(y_true_new)
    y_true_new = K.reshape(y_true_new, (BATCH_SIZE, MAX_SEQUENCE_LEN - 1, 1))

    # Same for y_pred
    y_pred_new = K.reshape(y_pred, (BATCH_SIZE, MAX_SEQUENCE_LEN))
    y_pred_new = K.transpose(y_pred_new)
    y_pred_new = y_pred_new[1:]
    y_pred_new = K.transpose(y_pred_new)
    y_pred_new = K.reshape(y_pred_new, (BATCH_SIZE, MAX_SEQUENCE_LEN - 1, 1))

    # Absolute difference, thresholded at 0.5, cast to float and summed
    diff = K.abs(y_true_new - y_pred_new)
    diff = K.greater(diff, 0.5)
    diff = K.cast(diff, 'float32')
    return K.sum(diff)
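
As an aside, on the TensorFlow backend Keras tensors support standard slice syntax, so the reshape/transpose dance could presumably be collapsed into a single slice. A sketch (equivalent in intent to the function above; it does not address the gradient issue raised in the comments below):

def custom_loss_sliced(y_true, y_pred):
    # Drop the first timestep directly, keeping the (batch, steps, 1) shape
    diff = K.abs(y_true[:, 1:, :] - y_pred[:, 1:, :])
    return K.sum(K.cast(K.greater(diff, 0.5), 'float32'))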

My model is:

from keras.models import Sequential
from keras.layers import Bidirectional, LSTM, TimeDistributed, Dense

model = Sequential()
model.add(Bidirectional(LSTM(128, return_sequences=True), input_shape=(MAX_SEQUENCE_LEN, 1)))
model.add(TimeDistributed(Dense(1, activation="sigmoid")))
  • If neither `BATCH_SIZE` nor `MAX_SEQUENCE` is `None`, then you probably get this problem because your loss function doesn't have any gradient. (The function "greater" cannot be differentiated; you need actual continuous functions in the result.) – Daniel Möller Sep 25 '17 at 18:14 (a differentiable soft-threshold along these lines is sketched after these comments)
  • You should probably stop at `K.abs(diff)` and add a `K.mean(diff)`. – Daniel Möller Sep 25 '17 at 18:15
  • I need to "normalize" the error, because a y_pred value of 4 instead of 1, or even 15 instead of 1, is an equally wrong prediction. And I need to turn every difference greater than 0.5 into 1, but if the error is less than 0.5 then round() makes it equal to y_true. So I need to minimize the count of "normalized" errors; I think the mean function is unacceptable in my case. // Edited: just tested K.max(). It is a bad idea: the loss value jumps around and the model does not train at all. Tested K.mean() too; it quickly drops from a loss of 0.8 to ~0.2 and then jumps around there – Bocharick Sep 25 '17 at 18:42
  • Could you tell me exactly what kind of objects y_pred and y_true are, and how to access their channels/elements? E.g. can you do y_pred[:,:,0] and y_pred[:,:,1] like with a NumPy array (I guess not, but...)? (This is not easy to debug since it would need to run in a compiled CNN, so it's better to know beforehand.) https://stackoverflow.com/questions/60582448/tensorflow-keras-custom-loss-function-access-tensor-channels – SheppLogan Mar 07 '20 at 21:53
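
Following Daniel Möller's point that K.greater has no gradient, one standard workaround (not from this thread; a sketch of a soft-threshold surrogate, with steepness as a made-up tuning parameter) is to replace the hard step with a steep sigmoid, which approximates the 0/1 step while keeping a usable gradient:

from keras import backend as K

def soft_step_loss(y_true, y_pred, steepness=20.0):
    # Drop the first timestep, as in the original loss
    diff = K.abs(y_true[:, 1:, :] - y_pred[:, 1:, :])
    # sigmoid(steepness * (diff - 0.5)) is ~0 below the 0.5 threshold and
    # ~1 above it, but differentiable everywhere, so gradients can flow
    soft = K.sigmoid(steepness * (diff - 0.5))
    return K.sum(soft)

How well this trains presumably depends on the steepness: a very steep sigmoid behaves like the hard step, while a gentle one behaves more like a plain mean of the errors.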

0 Answers