I want to minimize a loss function that combines F1 and cross-entropy; say my loss function is:
loss = a * cross_entropy_loss + (1 - a) * F1_value_loss
Since the F1 score is a non-differentiable function, and TensorFlow does not supply a way to compute an approximate gradient for an F1 loss, I want to define my own gradient for F1_value_loss while still using TensorFlow's built-in gradient for cross_entropy_loss when optimizing the whole network. So how can I define my own gradient for F1_value_loss, combine it with cross_entropy_loss, and then propagate the loss to the previous layers? Thanks.
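For reference, here is roughly what I have in mind (a sketch assuming TF 2.x and binary classification; `make_f1_value_loss` and the soft-F1 surrogate gradient are my own assumptions, not an existing TensorFlow API):

```python
import tensorflow as tf

def make_f1_value_loss(y_true, eps=1e-7):
    """Hypothetical helper: forward pass is 1 - F1 on hard (rounded)
    predictions; backward pass uses a hand-derived "soft" F1 gradient
    (probabilities in place of hard predictions) as a surrogate."""

    @tf.custom_gradient
    def f1_value_loss(y_pred):
        # Forward: the true, non-differentiable F1 on thresholded predictions.
        y_hard = tf.round(y_pred)
        tp = tf.reduce_sum(y_true * y_hard)
        fp = tf.reduce_sum((1.0 - y_true) * y_hard)
        fn = tf.reduce_sum(y_true * (1.0 - y_hard))
        loss = 1.0 - 2.0 * tp / (2.0 * tp + fp + fn + eps)

        def grad(dy):
            # Backward: gradient of the differentiable soft-F1 loss
            # 1 - 2*tp_s / (2*tp_s + fp_s + fn_s), where tp_s etc. use
            # probabilities. Note d(denom)/d y_pred_i == 1 elementwise,
            # and d(tp_s)/d y_pred_i == y_true_i, which gives the closed form
            # below.
            tp_s = tf.reduce_sum(y_true * y_pred)
            fp_s = tf.reduce_sum((1.0 - y_true) * y_pred)
            fn_s = tf.reduce_sum(y_true * (1.0 - y_pred))
            denom = 2.0 * tp_s + fp_s + fn_s + eps
            g = 2.0 * (tp_s - y_true * denom) / tf.square(denom)
            return dy * g

        return loss, grad

    return f1_value_loss

# Example usage: autodiff handles cross-entropy as usual, while the
# custom gradient above handles the F1 term.
y_true = tf.constant([[1.0], [0.0], [1.0], [1.0]])
y_pred = tf.constant([[0.8], [0.3], [0.6], [0.2]])

a = 0.5  # mixing weight, my choice here just for illustration
with tf.GradientTape() as tape:
    tape.watch(y_pred)
    ce = tf.reduce_mean(tf.keras.losses.binary_crossentropy(y_true, y_pred))
    f1_loss = make_f1_value_loss(y_true)(y_pred)
    loss = a * ce + (1.0 - a) * f1_loss
grad_wrt_pred = tape.gradient(loss, y_pred)
```

Is this the right way to do it, or is there a better-supported mechanism for overriding the gradient of only one term of a combined loss?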