
I need help writing a custom loss/metric function for Keras. My categories are binary encoded (not one-hot). I would like to do a bitwise comparison between the real classes and the predicted classes.

For example: real label `0x1111111111`, predicted label `0x1011101111`.

The predicted label has 8 of 10 bits correct, so the accuracy of this match should be 0.8, not 0.0. I have no idea how I am supposed to do this with Keras commands.

EDIT 1: Currently I am using something like this, but it is not working yet:

def custom_binary_error(y_true, y_pred, n=11):
    diff_dec = K.tf.bitwise.bitwise_xor(K.tf.cast(y_true, K.tf.int32), K.tf.cast(y_pred, K.tf.int32))
    diff_bin = K.tf.mod(K.tf.bitwise.right_shift(K.tf.expand_dims(diff_dec,1), K.tf.range(n)), 2)
    diff_sum = K.tf.math.reduce_sum(diff_bin, 1)
    diff_percent = K.tf.math.divide(diff_sum, 11)
    return K.tf.math.reduce_mean(diff_percent, 0)

I get this error:

ValueError: Dimensions must be equal, but are 2048 and 11 for 'loss/activation_1_loss/RightShift' (op: 'RightShift') with input shapes: [?,1,2048], [11].
Ryan Hope
  • What does your prediction look like? Is it an integer value or something else? – Anakin Apr 19 '19 at 09:23
  • Such a strange encoding to me! May I ask why you chose to design the model so that it outputs a real number (which later needs to be cast to an integer)? Instead, you could model it as a multi-label classification task with a sigmoid layer as the last layer. – today Apr 23 '19 at 14:47
  • So my last layer would be a dense layer with as many nodes as I have bits in my class encoding, and a sigmoid activation to map the outputs to 0-1. What would my loss/accuracy measures be then? – Ryan Hope Apr 23 '19 at 15:10
  • This seems relevant: https://stackoverflow.com/a/47340304/3513267 – Ryan Hope Apr 23 '19 at 15:22
  • @RyanHope That's right. You need to use `binary_crossentropy` as the loss and `accuracy` as the metric (it would automatically switch to `binary_accuracy`). – today Apr 23 '19 at 16:14
  • Got it working, thanks! – Ryan Hope Apr 23 '19 at 20:29
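For reference, the multi-label route from the comments requires the integer-coded labels to be unpacked into per-bit 0/1 vectors before training. A minimal sketch of that unpacking (the helper name `int_to_bits` is made up here, not part of any library):

```python
import numpy as np

def int_to_bits(labels, n_bits):
    # Hypothetical helper: unpack integer-coded class labels into
    # fixed-width 0/1 bit vectors (most significant bit first).
    labels = np.asarray(labels)
    return (labels[:, None] >> np.arange(n_bits)[::-1]) & 1

# The question's example pair, written as binary integers:
bits = int_to_bits([0b1111111111, 0b1011101111], n_bits=10)
# bits[0] -> [1 1 1 1 1 1 1 1 1 1]
# bits[1] -> [1 0 1 1 1 0 1 1 1 1]
```

These bit vectors can then be fed to a `Dense(10, activation='sigmoid')` output layer trained with `binary_crossentropy` and the `accuracy` metric, as suggested in the comments above.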

2 Answers


I am trying something with the assumption that `y_true` and `y_pred` are positive integers.

def custom_binary_error(y_true, y_pred):
    width = max(y_true.bit_length(), y_pred.bit_length())       # width of the longer bit sequence
    diff = np.bitwise_xor(y_true, y_pred)       # 1 where bits differ, 0 where they match
    error = np.binary_repr(diff, width=width).count('1')/width       # fraction of differing bits
    return K.variable(error)

Use `1 - error` for accuracy. I have not tested it; this is just to give an idea.
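Checked on the question's example pair (treating the labels as binary-coded integers), the idea gives the expected 0.8 accuracy. A plain-NumPy run, no Keras involved:

```python
import numpy as np

y_true = 0b1111111111   # the question's real label
y_pred = 0b1011101111   # the question's predicted label

width = max(y_true.bit_length(), y_pred.bit_length())   # 10 bits
diff = np.bitwise_xor(y_true, y_pred)                   # bits set where the labels disagree
error = np.binary_repr(diff, width=width).count('1') / width

print(error)   # 0.2, i.e. accuracy 1 - 0.2 = 0.8
```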

Anakin
  • yes y_true and t_pred would be all positive integers, I will give this a shot – Ryan Hope Apr 19 '19 at 14:42
  • Yeah give it a try and let me know. The idea works in Python. Have not checked in Keras. Hopefully I did not miss anything Keras specific. – Anakin Apr 19 '19 at 14:46
  • Did not work in Keras. I should have been more specific about y_true and y_pred... they are both Keras tensors (but once evaluated they should be integers). – Ryan Hope Apr 19 '19 at 15:36
  • Tensorflow has a bitwise_xor, if there was a way to unpack the bits and then sum them, I could divide by the max bits to get the percentage – Ryan Hope Apr 19 '19 at 16:17
  • Yeah, I was expecting that kind of a problem. Have to check if TF provides any similar functionality. – Anakin Apr 19 '19 at 16:57
  • Apparently TensorFlow's `bitwise_xor` does not have a gradient defined. – Anakin Apr 19 '19 at 17:04
  • I updated my main post with a modification of your function that needs some help. – Ryan Hope Apr 19 '19 at 23:50

This is how you could define your error:

import tensorflow as tf

def custom_binary_error(y_true, y_pred):
    y_true = tf.cast(y_true, tf.bool)
    y_pred = tf.cast(y_pred, tf.bool)
    xored = tf.logical_xor(y_true, y_pred)
    notxored = tf.logical_not(xored)
    sum_xored = tf.reduce_sum(tf.cast(xored, tf.float32))
    sum_notxored = tf.reduce_sum(tf.cast(notxored, tf.float32))
    return sum_xored / (sum_xored + sum_notxored)

Testing it with 2 labels of size 6:

import tensorflow as tf

y_train_size = 6

y_train = [[1, 1, 1, 1, 1, 1], [0, 0, 0, 0, 0, 0]]
y_pred = tf.convert_to_tensor([[1, 1, 1, 1, 0, 0], [0, 0, 0, 0, 1, 0]])
y = tf.placeholder(tf.int32, shape=(None, y_train_size))
error = custom_binary_error(y, y_pred)
with tf.Session() as sess:
    res = sess.run(error, feed_dict={y:y_train})
    print(res) # 0.25

Using it in Keras:

import tensorflow as tf
import numpy as np

y_train_size = 6

def custom_binary_error(y_true, y_pred):
    y_true = tf.cast(y_true, tf.bool)
    y_pred = tf.cast(y_pred, tf.bool)
    xored = tf.logical_xor(y_true, y_pred)
    notxored = tf.logical_not(xored)
    sum_xored = tf.reduce_sum(tf.cast(xored, tf.float32))
    sum_notxored = tf.reduce_sum(tf.cast(notxored, tf.float32))
    return sum_xored / (sum_xored + sum_notxored)

model = tf.keras.models.Sequential()
model.add(tf.keras.layers.Dense(y_train_size))

model.compile(optimizer=tf.keras.optimizers.SGD(0.01),
              loss=[tf.keras.losses.MeanAbsoluteError()],
              metrics=[custom_binary_error])

y_train = np.array([[1, 1, 1, 1, 1, 1], [0, 0, 0, 0, 0, 0]])
x_train = np.random.normal(size=(2, 2))

model.fit(x_train, y_train, epochs=2)

will result in:

Epoch 1/2
2/2 [==============================] - 0s 23ms/sample - loss: 1.4097 - custom_binary_error: 0.5000
Epoch 2/2
2/2 [==============================] - 0s 328us/sample - loss: 1.4017 - custom_binary_error: 0.5000

Note

If you want accuracy instead of error, the custom_binary_error() function should return

sum_notxored / (sum_xored + sum_notxored)
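As a sanity check (plain NumPy, no TF session; using the same test labels as above), the two formulas are complements of each other:

```python
import numpy as np

y_true = np.array([[1, 1, 1, 1, 1, 1], [0, 0, 0, 0, 0, 0]], dtype=bool)
y_pred = np.array([[1, 1, 1, 1, 0, 0], [0, 0, 0, 0, 1, 0]], dtype=bool)

xored = np.logical_xor(y_true, y_pred)   # True where a bit is wrong
error = xored.mean()                     # fraction of wrong bits: 3/12 = 0.25
accuracy = 1.0 - error                   # fraction of correct bits: 0.75
```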
Vlad