I'm trying to create a simple weighted loss function.

Say, I have input dimensions 100 * 5, and output dimensions also 100 * 5. I also have a weight matrix of the same dimension.

Something like the following:

import numpy as np
train_X = np.random.randn(100, 5)
train_Y = np.random.randn(100, 5)*0.01 + train_X

weights = np.random.randn(*train_X.shape)

Defining the custom loss function

def custom_loss_1(y_true, y_pred):
    return K.mean(K.abs(y_true-y_pred)*weights)

Defining the model

from keras.layers import Dense, Input
from keras import Model
import keras.backend as K

input_layer = Input(shape=(5,))
out = Dense(5)(input_layer)
model = Model(input_layer, out)

Testing with existing metrics works fine

model.compile('adam','mean_absolute_error')
model.fit(train_X, train_Y, epochs=1)

Testing with our custom loss function doesn't work

model.compile('adam',custom_loss_1)
model.fit(train_X, train_Y, epochs=10)

It gives the following stack trace:

InvalidArgumentError (see above for traceback): Incompatible shapes: [32,5] vs. [100,5]
 [[Node: loss_9/dense_8_loss/mul = Mul[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"](loss_9/dense_8_loss/Abs, loss_9/dense_8_loss/mul/y)]]

Where is the number 32 coming from?

Testing a loss function with weights as Keras tensors

def custom_loss_2(y_true, y_pred):
    return K.mean(K.abs(y_true-y_pred)*K.ones_like(y_true))

This function seems to work, which suggests that a Keras tensor would work as the weight matrix. So, I created another version of the loss function.

Loss function try 3

from functools import partial

def custom_loss_3(y_true, y_pred, weights):
    return K.mean(K.abs(y_true-y_pred)*K.variable(weights, dtype=y_true.dtype))

cl3 = partial(custom_loss_3, weights=weights)  

Fitting data using cl3 gives the same error as above.

InvalidArgumentError (see above for traceback): Incompatible shapes: [32,5] vs. [100,5]
     [[Node: loss_11/dense_8_loss/mul = Mul[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"](loss_11/dense_8_loss/Abs, loss_11/dense_8_loss/Variable/read)]]

I wonder what I'm missing! I could have used Keras's sample_weight, but then I'd have to reshape my inputs into a 3D array.

I thought that this custom loss function should really have been trivial.

Nipun Batra

1 Answer

In model.fit the batch size is 32 by default; that's where this number is coming from. Here's what's happening:

  • In custom_loss_1 the tensor K.abs(y_true-y_pred) has shape (batch_size=32, 5), while the NumPy array weights has shape (100, 5). This is an invalid multiplication: the first dimensions differ (32 vs. 100) and neither is 1, so broadcasting can't be applied.

  • In custom_loss_2 this problem doesn't exist because you're multiplying 2 tensors with the same shape (batch_size=32, 5).

  • In custom_loss_3 the problem is the same as in custom_loss_1, because converting weights into a Keras variable doesn't change their shape.
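
The three cases above can be reproduced without Keras. The following is a minimal NumPy sketch, assuming that NumPy's broadcasting rules mirror TensorFlow's for these shapes; the array names are made up for illustration and the random values are placeholders.

```python
import numpy as np

# Stands in for K.abs(y_true - y_pred) on one default-sized batch
batch_errors = np.abs(np.random.randn(32, 5))
# One weight per element of the full training set, as in the question
full_weights = np.random.randn(100, 5)

try:
    batch_errors * full_weights
    mismatch_raised = False
except ValueError:
    # 32 != 100 and neither dimension is 1, so broadcasting is impossible
    mismatch_raised = True

# Operands with identical shapes multiply element-wise (the custom_loss_2 case)
same_shape = batch_errors * np.ones_like(batch_errors)

# A length-5 vector broadcasts across all rows of the batch
per_column = batch_errors * np.random.randn(5)

print(mismatch_raised, same_shape.shape, per_column.shape)
```

The last case is the only one where a fixed NumPy array can be used directly inside the loss, because its shape doesn't depend on the batch size.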


UPDATE: It seems you want to give a different weight to each element in each training sample, so the weights array should indeed have shape (100, 5). In this case, I would feed the weights array into the model as an additional input and then use that tensor inside the loss function:

import numpy as np
from keras.layers import Dense, Input
from keras import Model
import keras.backend as K
from functools import partial


def custom_loss_4(y_true, y_pred, weights):
    return K.mean(K.abs(y_true - y_pred) * weights)


train_X = np.random.randn(100, 5)
train_Y = np.random.randn(100, 5) * 0.01 + train_X
weights = np.random.randn(*train_X.shape)

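input_layer = Input(shape=(5,))
# Second input carrying the per-element weights; Keras batches it together with train_X
weights_tensor = Input(shape=(5,))
out = Dense(5)(input_layer)
# Bind the symbolic weights tensor so the loss keeps the (y_true, y_pred) signature
cl4 = partial(custom_loss_4, weights=weights_tensor)
model = Model([input_layer, weights_tensor], out)
model.compile('adam', cl4)
# The weights array is passed as an input, so each batch gets its matching rows
model.fit(x=[train_X, weights], y=train_Y, epochs=10)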
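
To see why passing the weights as an input avoids the shape error, here is a rough NumPy illustration of what happens per batch. It is a sketch of the idea, not of Keras internals: `iter_batches` and `weighted_mae` are hypothetical helpers, the identity "prediction" is a placeholder, and shuffling is left out (as far as I know, Keras shuffles by default but applies the same row indices to every input and to the targets).

```python
import numpy as np


def iter_batches(arrays, batch_size=32):
    """Yield aligned row slices of every array (hypothetical helper, no shuffling)."""
    n_rows = len(arrays[0])
    for start in range(0, n_rows, batch_size):
        yield [a[start:start + batch_size] for a in arrays]


def weighted_mae(y_true, y_pred, weights):
    """NumPy analogue of custom_loss_4."""
    return np.mean(np.abs(y_true - y_pred) * weights)


rng = np.random.default_rng(0)
train_X = rng.standard_normal((100, 5))
train_Y = rng.standard_normal((100, 5)) * 0.01 + train_X
weights = rng.standard_normal((100, 5))

batch_shapes = []
batch_losses = []
for x_b, y_b, w_b in iter_batches([train_X, train_Y, weights]):
    y_pred_b = x_b  # placeholder prediction, purely for illustration
    batch_shapes.append((y_b.shape, w_b.shape))
    batch_losses.append(weighted_mae(y_b, y_pred_b, w_b))

print(batch_shapes)
```

Because the weights are sliced with the same rows as the targets, the two shapes agree in every batch, including the final short one (100 = 3 * 32 + 4).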
rvinas
  • Thanks. So, `model.compile('adam', custom_loss_1)` followed by `model.fit(train_X, train_Y, epochs=10, batch_size=len(train_Y))` works. Would it be possible to correctly use the corresponding batch from the weight matrix? – Nipun Batra Jan 03 '18 at 18:19
  • I'm not entirely sure about the purpose of this weighted loss. Do you want to have a different weight for each training example, or rather a different weight for each "class"? – rvinas Jan 03 '18 at 18:22
  • I'm trying to use a different weight for each "class". However, my outputs are real-valued variables. A lot of them are zeros. A very few of them are, say, greater than 10. So, it's a case of imbalance, which I'm trying to cover using weighting. Something like: give lower weight to 0s and higher weight to anything more than zero. Without weighting, I end up predicting zeros! – Nipun Batra Jan 03 '18 at 18:26
  • In that case, you should only have a different weight for each class (i.e. only 5 weights, not 100*5 weights). Your code will work if you define weights as: `weights = np.random.randn(5)` – rvinas Jan 03 '18 at 18:29
  • Actually, the notion of 100*5 weights comes from the fact that different samples (the 100 dimension) can have different numbers of zero and non-zero values. Think of it this way: given a 100*5 matrix with a lot of zeros, I want to give higher weight to the non-zeros, which can occur anywhere in the matrix. – Nipun Batra Jan 03 '18 at 18:37
  • Ok. Please correct me if I'm wrong: if `train_Y[i, :]` is an array with 5 zeros, then you want to give sample `i` a low weight, right? – rvinas Jan 03 '18 at 18:44
  • Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/162447/discussion-between-nipun-batra-and-rvinas). – Nipun Batra Jan 03 '18 at 18:45
  • See also: https://stackoverflow.com/a/50127646/1447257 custom losses taking multiple inputs can be added with `Model.add_loss` – Philipp H. Nov 11 '18 at 23:12
  • Thanks for the code. But I got this error: https://i.postimg.cc/6qwP5Bsr/error.png . I also asked the same question: https://stackoverflow.com/questions/74425890/how-to-define-specific-coefficients-for-each-input-feature-to-increase-and-decre. I need to find the solution. I don't understand why a person commented and denied it. – Aref Hemati Nov 15 '22 at 20:25
  • This error: TypeError: You are passing KerasTensor(type_spec=TensorSpec(shape=(), dtype=tf.float32, name=None), name='Placeholder:0', description="created by layer 'tf.cast_3'"), an intermediate Keras symbolic input/output, to a TF API that does not allow registering custom dispatchers, such as `tf.cond`, `tf.function`, gradient tapes, or `tf.map_fn`. Keras Functional model construction only supports TF API calls that *do* support dispatching, such as `tf.math.add` or `tf.reshape`. Other APIs cannot be called directly on symbolic Keras inputs/outputs. You can work around this limitation by ... – Aref Hemati Nov 15 '22 at 20:29
  • ...You can work around this limitation by putting the operation in a custom Keras layer `call` and calling that layer on this symbolic input/output. – Aref Hemati Nov 15 '22 at 20:30