6

To be precise, the loss function I'm looking for is the squared error when the absolute error is less than 0.5, and the absolute error itself when the absolute error is greater than 0.5. This way, the gradient of the loss never exceeds 1: once the gradient of the squared-error branch reaches 1, the absolute-error branch takes over and the gradient stays constant at 1. I've included my current implementation below. For some reason, it's giving me worse performance than just the squared error.

fn_choice_maker1 = (tf.to_int32(tf.sign(y - y_ + 0.5)) + 1)/2
fn_choice_maker2 = (tf.to_int32(tf.sign(y_ - y + 0.5)) + 1)/2
choice_maker_sqr = tf.to_float(tf.mul(fn_choice_maker1, fn_choice_maker2))

sqr_contrib = tf.mul(choice_maker_sqr, tf.square(y - y_))
abs_contrib = tf.abs(y - y_)-0.25 - tf.mul(choice_maker_sqr, tf.abs(y - y_)-0.25)
loss = tf.reduce_mean(sqr_contrib + abs_contrib)
train_step = tf.train.AdamOptimizer(1e-4).minimize(loss)

choice_maker_sqr is a column tensor that is one whenever the error is between -0.5 and 0.5. The names are pretty self-explanatory.
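
To illustrate what this mask evaluates to, here is a small NumPy sketch of the same logic on a few sample errors (added for clarity only; the actual graph uses the TensorFlow ops above):

import numpy as np

# Illustrative sketch (not part of the graph above): the masking logic
# evaluated in NumPy on a few sample errors.
err = np.array([-1.0, -0.3, 0.0, 0.3, 1.0])                          # err = y - y_
fn_choice_maker1 = (np.sign(err + 0.5).astype(np.int32) + 1) // 2    # 1 where err > -0.5
fn_choice_maker2 = (np.sign(0.5 - err).astype(np.int32) + 1) // 2    # 1 where err < 0.5
choice_maker_sqr = (fn_choice_maker1 * fn_choice_maker2).astype(np.float32)
print(choice_maker_sqr)  # [0. 1. 1. 1. 0.] -> 1 only where |err| < 0.5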

Eddy
  • When you are saying 'it's giving me worse performance', are you talking about the speed of running a step, or your model's learning performance? The former is expected because there are many more ops when you compute the Huber loss vs. just the squared loss. If it's the latter, then are you really asking about the efficacy of using the Huber loss for your problem? In that case, it might help if you include some more details of your model. – keveman Aug 23 '16 at 18:15
  • It's the latter. This is for a reinforcement learning model, and is related to [this stack overflow question](http://stackoverflow.com/questions/36462962/loss-clipping-in-tensor-flow-on-deepminds-dqn). I'm actually more interested in knowing if my implementation of the Huber loss is wrong in any way (in tensorflow). – Eddy Aug 23 '16 at 18:33
  • `tf.cond(tf.abs(y-y_) < 0.5, lambda: tf.square(y-y_), lambda: tf.abs(y-y_))` would be a more straightforward implementation of the description. – keveman Aug 23 '16 at 20:39
  • The function you describe has a discontinuity at `|error| = 0.5`. A correct Huber loss would be `tf.cond(tf.abs(error) < 0.5, lambda: tf.square(error), lambda: tf.abs(error) - 0.25)`. – Alex Mar 15 '17 at 23:02
  • To implement this for vectors, you can use `tf.where`: `tf.where(tf.abs(error) < 0.5, tf.square(error), tf.abs(error) - 0.25)` – Alex Mar 16 '17 at 05:17

4 Answers

5

Here is my implementation of the Huber loss function in Python TensorFlow:

import tensorflow as tf

def huber_loss(y_true, y_pred, max_grad=1.):
    """Calculates the Huber loss.

    Parameters
    ----------
    y_true: np.array, tf.Tensor
      Target value.
    y_pred: np.array, tf.Tensor
      Predicted value.
    max_grad: float, optional
      Positive floating point value. Represents the maximum possible
      gradient magnitude.

    Returns
    -------
    tf.Tensor
      The Huber loss.
    """
    err = tf.abs(y_true - y_pred, name='abs')
    mg = tf.constant(max_grad, name='max_grad')
    lin = mg * (err - .5 * mg)
    quad = .5 * err * err
    return tf.where(err < mg, quad, lin)
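
A minimal usage sketch, assuming predictions y and targets y_ as in the question (the names and learning rate are illustrative):

# Illustrative usage only; y and y_ are assumed to be defined elsewhere.
per_element = huber_loss(y, y_, max_grad=1.)
loss = tf.reduce_mean(per_element)  # reduce the element-wise loss to a scalar
train_step = tf.train.AdamOptimizer(1e-4).minimize(loss)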
Brad Saund
1

You can use tf.select to implement it in a single call:

err = y - y_
huber_loss = tf.select(tf.abs(err) < 1.0,
                       0.5 * tf.square(err),
                       tf.abs(err) - 0.5)  # if, then, else
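
On TensorFlow 1.0 and later, tf.select was replaced by the three-argument form of tf.where, so the equivalent call would be (same logic, only the op name changes):

# Same expression on TensorFlow 1.0+, where tf.select became tf.where.
err = y - y_
huber_loss = tf.where(tf.abs(err) < 1.0,
                      0.5 * tf.square(err),
                      tf.abs(err) - 0.5)  # if, then, else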
runDOSrun
1
err = tf.subtract(x, y)
huber_loss = tf.where(tf.less(tf.abs(err), 1.0),
                      0.5 * tf.square(err),
                      tf.abs(err) - 0.5)
with tf.Session() as sess:
    print(sess.run(tf.reduce_mean(huber_loss)))
  • Please provide a detailed explanation so that it is easily understandable for users in the future. – Vivz Jul 04 '17 at 09:35
0

Not sure if this is still relevant, but I would like to point it out for those seeking this in the future. The TensorFlow research losses script has an implementation of the Huber loss for object detection (as it's implemented in the Faster R-CNN paper).

Here's the link to the method

Eshmeister