I'm trying to implement the custom loss function from this SO post, with some minor changes to suit my model. For context, I'm using multi-label targets with 5 classes, encoded as follows:
0 => [1, 0, 0, 0, 0]
1 => [1, 1, 0, 0, 0]
2 => [1, 1, 1, 0, 0]
3 => [1, 1, 1, 1, 0]
4 => [1, 1, 1, 1, 1]
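To make the encoding concrete, here is a minimal NumPy sketch of it. The helper names `encode_ordinal` and `decode_ordinal` are hypothetical (they are not part of my model); the decoding step mirrors the `K.sum(..., axis=1) - 1` used in the loss function further down.

```python
import numpy as np

NUM_CLASSES = 5  # number of classes, as in the encoding above


def encode_ordinal(label, num_classes=NUM_CLASSES):
    """Hypothetical helper: class k becomes k + 1 leading ones."""
    vec = np.zeros(num_classes, dtype=np.int32)
    vec[: label + 1] = 1
    return vec


def decode_ordinal(vec):
    """Hypothetical helper: recover the class index as (number of ones) - 1."""
    return int(np.sum(vec)) - 1


for label in range(NUM_CLASSES):
    print(label, "=>", encode_ordinal(label).tolist())
```

Summing the ones and subtracting 1 recovers the original class index, which is why the loss function reduces both `y_true` and `y_pred` that way.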
My custom loss function:

def _cohen_kappa(y_true, y_pred, num_classes=5, weights=None, metrics_collections=None, updates_collections=None, name=None):
    kappa, update_op = tf.contrib.metrics.cohen_kappa(y_true, y_pred, num_classes, weights, metrics_collections, updates_collections, name)
    kappa = K.cast(kappa, 'float32')
    K.get_session().run(tf.local_variables_initializer())
    with tf.control_dependencies([update_op]):
        kappa = tf.identity(kappa)
    return kappa

def cohen_kappa_loss(num_classes=5, weights=None, metrics_collections=None, updates_collections=None, name=None):
    def cohen_kappa(y_true, y_pred):
        y_true = K.cast(y_true, 'int32')
        y_pred = K.cast(y_pred + 0.5, 'int32')
        y_true = tf.subtract(K.sum(y_true, axis=1), tf.constant(1))
        y_pred = tf.subtract(K.sum(y_pred, axis=1), tf.constant(1))
        return -_cohen_kappa(y_true, y_pred, num_classes, weights, metrics_collections, updates_collections, name)
    return cohen_kappa
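For reference, this is roughly the quantity I expect the loss to negate: a minimal NumPy sketch of unweighted Cohen's kappa computed from a confusion matrix of integer class labels. It is not the `tf.contrib.metrics.cohen_kappa` implementation, and the function name `cohen_kappa_np` is hypothetical; it only assumes the standard definition kappa = (p_o - p_e) / (1 - p_e).

```python
import numpy as np


def cohen_kappa_np(y_true, y_pred, num_classes=5):
    """Hypothetical helper: unweighted Cohen's kappa for integer class labels."""
    y_true = np.asarray(y_true, dtype=np.int64)
    y_pred = np.asarray(y_pred, dtype=np.int64)

    # Confusion matrix: rows are true labels, columns are predicted labels.
    confusion = np.zeros((num_classes, num_classes), dtype=np.float64)
    np.add.at(confusion, (y_true, y_pred), 1.0)

    total = confusion.sum()
    # Observed agreement: fraction of samples on the diagonal.
    p_observed = np.trace(confusion) / total
    # Agreement expected by chance, from the row and column marginals.
    p_expected = np.dot(confusion.sum(axis=1), confusion.sum(axis=0)) / total ** 2

    # Note: undefined (division by zero) when p_expected == 1,
    # e.g. when every label and every prediction is the same class.
    return (p_observed - p_expected) / (1.0 - p_expected)
```

Every step here works on integer counts, so the value only changes when a predicted label flips from one class to another.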
This is how I'm attempting to use my loss function:
model_cohen_kappa = cohen_kappa_loss(num_classes=5)
model.compile(loss=model_cohen_kappa,
              optimizer=optimizers.SGD(lr=0.0001, momentum=0.9),
              metrics=['accuracy'])
Unfortunately, I get the error below, which is confusing because my loss function doesn't contain K.argmax, K.round, or K.eval, the operations the error message lists as non-differentiable. Is there another non-differentiable operation in my custom loss function that I'm not noticing?
Traceback (most recent call last):
  File "small_test.py", line 106, in <module>
    main()
  File "small_test.py", line 101, in main
    max_queue_size=2
  File "C:\Users\Anaconda3\envs\tensor\lib\site-packages\keras\legacy\interfaces.py", line 91, in wrapper
    return func(*args, **kwargs)
  File "C:\Users\Anaconda3\envs\tensor\lib\site-packages\keras\engine\training.py", line 1418, in fit_generator
    initial_epoch=initial_epoch)
  File "C:\Users\Anaconda3\envs\tensor\lib\site-packages\keras\engine\training_generator.py", line 40, in fit_generator
    model._make_train_function()
  File "C:\Users\Anaconda3\envs\tensor\lib\site-packages\keras\engine\training.py", line 509, in _make_train_function
    loss=self.total_loss)
  File "C:\Users\Anaconda3\envs\tensor\lib\site-packages\keras\legacy\interfaces.py", line 91, in wrapper
    return func(*args, **kwargs)
  File "C:\Users\Anaconda3\envs\tensor\lib\site-packages\keras\optimizers.py", line 184, in get_updates
    grads = self.get_gradients(loss, params)
  File "C:\Users\Anaconda3\envs\tensor\lib\site-packages\keras\optimizers.py", line 91, in get_gradients
    raise ValueError('An operation has `None` for gradient. '
ValueError: An operation has `None` for gradient. Please make sure that all of your ops have a gradient defined (i.e. are differentiable). Common ops without gradient: K.argmax, K.round, K.eval.
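To illustrate where I think the gradient might be getting lost, here is a minimal NumPy sketch (not the Keras graph itself) of the thresholding step `K.cast(y_pred + 0.5, 'int32')` followed by the sum. The helper name `decode_pred` and the sample values are made up for illustration. As far as I can tell, the decoded label does not move when the predictions are nudged slightly, so a finite-difference estimate of the slope is zero.

```python
import numpy as np


def decode_pred(y_pred):
    """Hypothetical helper mimicking K.cast(y_pred + 0.5, 'int32'), then sum(axis=1) - 1."""
    # Casting a positive float to int truncates, so this thresholds at 0.5.
    hard = (np.asarray(y_pred, dtype=np.float64) + 0.5).astype(np.int32)
    return hard.sum(axis=1) - 1


# Made-up sigmoid-like outputs for a single sample.
y_pred = np.array([[0.9, 0.8, 0.6, 0.3, 0.1]])
eps = 1e-3

base = decode_pred(y_pred)          # decoded class for the original outputs
nudged = decode_pred(y_pred + eps)  # decoded class after a small nudge
finite_diff = (nudged - base) / eps

print("decoded:", base, "finite-difference slope:", finite_diff)
```

If this picture is right, the integer casts on `y_true` and `y_pred` (and the count-based kappa computation after them) would be piecewise constant in the network outputs, which would be consistent with the `None` gradient in the error above.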
I suspect K.cast is non-differentiable, but removing the snippet below from my loss function results in a different error:
kappa = K.cast(kappa, 'float32')
Error:
Traceback (most recent call last):
  File "small_test.py", line 106, in <module>
    main()
  File "small_test.py", line 91, in main
    metrics=['accuracy'])
  File "C:\Users\Anaconda3\envs\tensor\lib\site-packages\keras\engine\training.py", line 342, in compile
    sample_weight, mask)
  File "C:\Users\Anaconda3\envs\tensor\lib\site-packages\keras\engine\training_utils.py", line 421, in weighted
    score_array *= weights
  File "C:\Users\Anaconda3\envs\tensor\lib\site-packages\tensorflow\python\ops\math_ops.py", line 884, in binary_op_wrapper
    return func(x, y, name=name)
  File "C:\Users\Anaconda3\envs\tensor\lib\site-packages\tensorflow\python\ops\math_ops.py", line 1180, in _mul_dispatch
    return gen_math_ops.mul(x, y, name=name)
  File "C:\Users\Anaconda3\envs\tensor\lib\site-packages\tensorflow\python\ops\gen_math_ops.py", line 6879, in mul
    "Mul", x=x, y=y, name=name)
  File "C:\Users\Anaconda3\envs\tensor\lib\site-packages\tensorflow\python\framework\op_def_library.py", line 563, in _apply_op_helper
    inferred_from[input_arg.type_attr]))
TypeError: Input 'y' of 'Mul' Op has type float32 that does not match type float64 of argument 'x'.