
I have a Keras model that has inputs x_1,...,x_n and d-dimensional outputs f(x_1),...,f(x_n). I'm working on a regression problem with d-dimensional targets y_1,...,y_n.

I would like to minimize the following loss function: for a fixed meta-parameter a between 0 and 1, the a-th (empirical) quantile of |f(x_i) - y_i|^2.
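To pin down what I mean, here is a plain NumPy sketch of the quantity I want as the loss; it's non-differentiable and only for illustration (the function name is mine, and `method='midpoint'` needs a recent NumPy; older versions spell it `interpolation='midpoint'`):

import numpy as np

def quantile_of_squared_errors(y_true, y_pred, a):
    # |f(x_i) - y_i|^2 for each sample: sum of squares over the d output dims
    ses = np.sum((y_true - y_pred) ** 2, axis=-1)
    # a-th empirical quantile of those squared errors, a in [0, 1]
    return np.quantile(ses, a, method='midpoint')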

Here is what I have coded so far:

import keras.backend as K
import tensorflow_probability as tfp

def keras_custom_loss(y_true, y_predicted):
    SEs = K.square(y_true - y_predicted)  # squared errors
    # median (50th percentile) of the squared errors over the batch
    out_custom = tfp.stats.percentile(SEs, 50.0, interpolation='midpoint')
    return out_custom
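For context, I compile it like any other custom loss (here `model` is just whatever Keras model I'm training):

model.compile(optimizer='adam', loss=keras_custom_loss)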

One issue is that I'd like to avoid using tensorflow_probability; I would like the entire implementation done in Keras.

However, I can't figure out how.

ABIM
  • Can you describe the input and output shapes properly? It's currently too ambiguous. Maybe you could print the model summary? Do you want the percentile over the batch dimension or over another dimension? – Daniel Möller Mar 16 '20 at 14:39
  • I want the percentile taken over the batch dimension. The output shape of my model is (None, 1). – ABIM Mar 16 '20 at 15:01
  • Ok :) -- When you say it must be done entirely in Keras, do you mean you cannot use TensorFlow at all? Or is it only about `tensorflow_probability`, which, as far as I know, is an external package? -- Can I give you solutions using `tensorflow`? – Daniel Möller Mar 16 '20 at 15:02
  • Sure, TensorFlow is okay. I just had an issue recently when coding something up in TensorFlow and then putting it into Keras (so I was a bit scared :P ) – ABIM Mar 16 '20 at 15:07
  • Do you want "the value" of the percentile or "all elements included in that percentile"? – Daniel Möller Mar 16 '20 at 16:17
  • Ok, then I think the answer below is ok. Just notice that training this will be hard: you will be training on at most two samples per batch, and the others are sort of free to go. The loss cannot capture the contribution of the other elements to the percentile; it takes the percentile samples pointwise. – Daniel Möller Mar 16 '20 at 16:20
  • Oh, I mean all values at and above that percentile, i.e. all percentiles above the given a. – ABIM Mar 16 '20 at 16:22

2 Answers


For your specific use case you can use the following function, which is a simplified version of `tfp.stats.percentile` (it is released under the Apache License 2.0):

import tensorflow as tf

def percentile(x, p):
    with tf.name_scope('percentile'):
        y = tf.transpose(x)  # take the percentile over the batch dimension
        sorted_y = tf.sort(y)
        # fractional index of the p-th percentile; computed in float64 so the
        # index doesn't round incorrectly for large arrays
        frac_idx = tf.cast(p, tf.float64) / 100. * (tf.cast(tf.shape(y)[-1], tf.float64) - 1.)
        # midpoint rule: average the two nearest order statistics
        # (tf.gather needs integer indices, hence the casts)
        return 0.5 * (
            tf.gather(sorted_y, tf.cast(tf.math.ceil(frac_idx), tf.int32), axis=-1)
            + tf.gather(sorted_y, tf.cast(tf.math.floor(frac_idx), tf.int32), axis=-1))
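If it helps, a minimal way to wire this into a Keras loss could look as follows (using the median, 50.0, from your question; this usage sketch is mine, not part of tfp):

import keras.backend as K

def keras_custom_loss(y_true, y_predicted):
    ses = K.square(y_true - y_predicted)  # squared errors, shape (None, 1)
    return percentile(ses, 50.0)          # median over the batch dimension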
a_guest
  • but does this average all the quantiles at and above a? – ABIM Mar 16 '20 at 16:43
  • @AnnieTheKatsu It gives the [p-th percentile](https://en.wikipedia.org/wiki/Percentile) similar to the `tfp.stats.percentile` function that you included in your question. If you want all elements above or below a certain percentile then a small modification is required, which has been realized in [the other answer](https://stackoverflow.com/a/60710000/3767239). – a_guest Mar 16 '20 at 19:15

For taking "all elements" above that percentile, you will need a different approach:

import keras.backend as K
from keras.layers import *
from keras.models import Model
import numpy as np
import tensorflow as tf

def above_percentile(x, p): # assuming the input is flattened: (n,)

    samples = K.cast(K.shape(x)[0], K.floatx()) # batch size
    p = (100. - p) / 100.  # 100% will return 0 elements, 0% will return all elements

    # number of samples to keep:
    samples = K.cast(tf.math.floor(p * samples), 'int32')
        # you can choose tf.math.ceil above; it depends on whether you want to
        # include or exclude one element. Suppose you want the top 33%,
        # but it's only possible to get exactly the top 30% or 40%:
        # floor will get the top 30% and ceil will get the top 40%.
        # (exact matches are included in both cases)

    # selected samples: the `samples` largest values of x
    values, indices = tf.math.top_k(x, samples)

    return values

def custom_loss(p):
    def loss(y_true, y_predicted):
        ses = K.square(y_true - y_predicted)        # squared errors
        above = above_percentile(K.flatten(ses), p) # values at/above the p-th percentile
        return K.mean(above)
    return loss

Test:

dataX = np.array([2, 3, 1, 4, 7, 10, 8, 5, 6]).reshape((-1, 1))
dataY = np.ones((9, 1))

# identity model, just to exercise the loss
ins = Input((1,))
outs = Lambda(lambda x: x)(ins)
model = Model(ins, outs)

model.compile(optimizer='adam', loss=custom_loss(70.))
model.fit(dataX, dataY)

The loss will be 65, which is 130/2 (the mean), and 130 = (10-1)² + (8-1)², since 10 and 8 are the two top-k elements of the input.
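A quick NumPy check of that arithmetic (a sketch, just to verify the numbers above):

import numpy as np

ses = (np.array([2, 3, 1, 4, 7, 10, 8, 5, 6]) - 1.0) ** 2  # squared errors
# p = 70 keeps floor((100 - 70)/100 * 9) = 2 top elements: 81 and 49
print(np.sort(ses)[-2:].mean())  # 65.0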

Daniel Möller
  • In the source code for `tfp.stats.percentile` they have [a comment](https://github.com/tensorflow/probability/blob/5276e84a4d9c39d87d91898efaf441f877d27de1/tensorflow_probability/python/stats/quantiles.py#L528) saying that `p` should be converted to `float64`, otherwise the wrong index might be computed for large arrays. By using `K.floatx()`, don't you run the risk of using `float32` if that's the default float type? – a_guest Mar 16 '20 at 19:19
  • I honestly don't think you will get any problems with that. What is your batch size? (Notice that the loss is computed batchwise.) You can try float64 too, no problem. – Daniel Möller Mar 16 '20 at 20:13
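For completeness, here is a sketch of the variant discussed in this exchange, doing the index arithmetic in float64 (the name `above_percentile_f64` and this variant are illustrative, not from the original answer):

def above_percentile_f64(x, p):  # x flattened: (n,)
    n = K.cast(K.shape(x)[0], 'float64')          # batch size, in float64
    frac = (100. - K.cast(p, 'float64')) / 100.   # fraction of elements to keep
    k = K.cast(tf.math.floor(frac * n), 'int32')  # number of top elements
    values, indices = tf.math.top_k(x, k)
    return values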