
I am modeling variables with the negative binomial distribution. Instead of predicting the expected mean value, I would rather model the two parameters of the distribution, so the output layer of my neural network is composed of two neurons. For this, I need to write a custom loss function. But the code below does not work; it seems to be an issue with iterating over Tensors.

How should I write the loss function with Keras (and TensorFlow) for negative binomial distribution?

I just need to rewrite this code so that it is friendly to TensorFlow's tensors. According to the error I got, tf.map_fn may lead to a solution, but I have had no luck with it.

This works well in general, but not with Keras / TensorFlow:

from scipy.stats import nbinom
from keras import backend as K
import tensorflow as tf

def loss_neg_bin(y_pred, y_true):
    result = 0.0
    for p, t in zip(y_pred, y_true):
        # p[0] is the n parameter, p[1] the success probability (capped at 0.99)
        result += -nbinom.pmf(t, p[0], min(0.99, p[1]))
    return result

The error I got:

TypeError: Tensor objects are only iterable when eager execution is enabled. To iterate over this tensor use tf.map_fn.

    have you tried to use [this](https://www.tensorflow.org/api_docs/python/tf/contrib/distributions/NegativeBinomial)? – Vlad Apr 21 '19 at 12:59
  • thanks, this seems to be just another parameterization of the distribution, with no clear way to use it for a loss function.. it is also deprecated and links to the tensorflow-probability lib, which seems promising, but I would like to use this different approach... – gugatr0n1c Apr 21 '19 at 14:14
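
For completeness, a minimal sketch of the route the comments point at, using tensorflow-probability (assuming tfp is installed; as noted above, its parameterization differs from scipy's nbinom, so the two losses are not interchangeable without adjustment):

import tensorflow as tf
import tensorflow_probability as tfp

def loss_neg_bin_tfp(y_true, y_pred):
    # y_pred[:, 0] plays the role of total_count, y_pred[:, 1] the probability
    dist = tfp.distributions.NegativeBinomial(
        total_count=y_pred[:, 0],
        probs=tf.minimum(0.99, y_pred[:, 1]))
    # assumes one observed count per sample in y_true
    return -dist.log_prob(y_true[:, 0])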

1 Answer


You need tf.map_fn to achieve the loop and tf.py_func to wrap nbinom.pmf. For example:

from scipy.stats import nbinom
import tensorflow as tf

def loss_neg_bin(y_pred, y_true):
    result = 0.0
    for p, t in zip(y_pred, y_true):
        result += -nbinom.pmf(t, p[0], min(0.99, p[1]))
    return result

y_pred = [[0.4, 0.4], [0.5, 0.5]]
y_true = [[1, 2], [1, 2]]
print('your version:\n', loss_neg_bin(y_pred, y_true))

def loss_neg_bin_tf(y_pred, y_true):
    # tf.map_fn loops over the batch dimension; tf.py_func wraps the
    # SciPy pmf so it can run on each (prediction, target) pair
    result = tf.map_fn(lambda x: tf.py_func(lambda p, t: -nbinom.pmf(t, p[0], min(0.99, p[1])),
                                            x,
                                            tf.float64),
                       (y_pred, y_true),
                       dtype=tf.float64)
    result = tf.reduce_sum(result, axis=0)
    return result

y_pred_tf = tf.placeholder(shape=(None, 2), dtype=tf.float64)
y_true_tf = tf.placeholder(shape=(None, 2), dtype=tf.float64)
loss = loss_neg_bin_tf(y_pred_tf, y_true_tf)

with tf.Session() as sess:
    print('tensorflow version:\n', sess.run(loss, feed_dict={y_pred_tf: y_pred, y_true_tf: y_true}))

# print
your version:
 [-0.34313146 -0.13616026]
tensorflow version:
 [-0.34313146 -0.13616026]

In addition, if you use tf.py_func to compute the probability mass function of the negative binomial as the loss of your model, you need to define the gradient function yourself, since TensorFlow cannot differentiate through a Python function.
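
For reference, here is a minimal sketch of that pattern with tf.custom_gradient (available since TF 1.7); the toy square function stands in for the pmf, since the real nbinom derivative would have to be written out by hand:

import tensorflow as tf

@tf.custom_gradient
def square_via_py_func(x):
    # the forward pass runs in plain Python, so TF cannot differentiate it
    y = tf.py_func(lambda v: v ** 2, [x], tf.float64)
    y.set_shape(x.get_shape())
    def grad(dy):
        # hand-written derivative: d(x^2)/dx = 2x
        return dy * 2.0 * x
    return y, grad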

Update: add a differentiable negative binomial loss

The probability mass function for nbinom is:

nbinom.pmf(k) = choose(k+n-1, n-1) * p**n * (1-p)**k

for k >= 0 according to scipy.stats.nbinom.
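
Taking the logarithm and writing the binomial coefficient with gamma functions, choose(k+n-1, n-1) = Γ(k+n) / (Γ(k+1) Γ(n)), gives:

log pmf(k) = lgamma(k + n) - lgamma(k + 1) - lgamma(n) + n*log(p) + k*log(1 - p)

This is exactly what nbinom_pmf_tf below computes with tf.lgamma before exponentiating.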

So I have added a differentiable version of the negative binomial loss.

import tensorflow as tf

def nbinom_pmf_tf(x, n, p):
    # log of the binomial coefficient choose(x + n - 1, n - 1), via lgamma
    coeff = tf.lgamma(n + x) - tf.lgamma(x + 1) - tf.lgamma(n)
    # exponentiate the log-pmf: coeff + n*log(p) + x*log(1 - p)
    return tf.cast(tf.exp(coeff + n * tf.log(p) + x * tf.log(1 - p)), dtype=tf.float64)

def loss_neg_bin_tf_differentiable(y_pred, y_true):
    result = tf.map_fn(lambda x: -nbinom_pmf_tf(x[1],
                                                x[0][0],
                                                tf.minimum(tf.constant(0.99, dtype=tf.float64), x[0][1])),
                       (y_pred, y_true),
                       dtype=tf.float64)
    result = tf.reduce_sum(result, axis=0)
    return result

y_pred_tf = tf.placeholder(shape=(None, 2), dtype=tf.float64)
y_true_tf = tf.placeholder(shape=(None, 2), dtype=tf.float64)
loss = loss_neg_bin_tf_differentiable(y_pred_tf, y_true_tf)
grads = tf.gradients(loss, y_pred_tf)

y_pred = [[0.4, 0.4], [0.5, 0.5]]
y_true = [[1, 2], [1, 2]]
with tf.Session() as sess:
    print('tensorflow differentiable version:')
    loss_val, grads_val = sess.run([loss, grads], feed_dict={y_pred_tf: y_pred, y_true_tf: y_true})
    print(loss_val)
    print(grads_val)

# print
tensorflow differentiable version:
[-0.34313146 -0.13616026]
[array([[-0.42401619,  0.27393084],
       [-0.36184822,  0.37565048]])]
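
To wire the differentiable loss into Keras, something like the sketch below should work; the architecture, activations, and input size are assumptions, and note that Keras calls a loss as loss(y_true, y_pred), the reverse of the argument order used above:

from keras import backend as K
from keras import models, layers

K.set_floatx('float64')  # the loss above is written for float64 tensors

def keras_loss(y_true, y_pred):
    # swap the arguments to match loss_neg_bin_tf_differentiable
    return loss_neg_bin_tf_differentiable(y_pred, y_true)

model = models.Sequential()
model.add(layers.Dense(32, activation='relu', input_shape=(10,)))  # input size assumed
model.add(layers.Dense(2, activation='sigmoid'))  # two output neurons: n and p
model.compile(optimizer='adam', loss=keras_loss)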