I am new to TensorFlow and Keras. I would like to use sample weights in a custom loss function.
If I understand correctly, this post (Custom loss function with weights in Keras) suggests passing the weights as an additional input to the network, and so does this one: Custom weighted loss function in Keras for weighing each element.
I am wondering if I am missing something (I would also like to avoid defining the weights as a global variable). I am also a bit surprised that there is no way to use them directly, since the Loss class's __call__ method accepts sample_weight as an argument, while (if I understand correctly) the loss function itself must take only y_true and y_pred.
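To illustrate my mental model of that mechanism, here is a simplified sketch (my own approximation, not the actual Keras source): __call__ computes the per-sample losses via call(), applies sample_weight, and then reduces:

import tensorflow as tf

class SketchLoss:
    def call(self, y_true, y_pred):
        # per-sample losses, shape (batch_size,)
        return tf.reduce_mean(tf.square(y_true - y_pred), axis=-1)

    def __call__(self, y_true, y_pred, sample_weight=None):
        losses = self.call(y_true, y_pred)
        if sample_weight is not None:
            # the weights scale the per-sample losses ...
            losses = losses * tf.cast(sample_weight, losses.dtype)
        # ... before they are reduced to a single scalar
        return tf.reduce_mean(losses)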
From the documentation (https://keras.io/api/losses/#creating-custom-losses), however:
Creating custom losses: Any callable with the signature loss_fn(y_true, y_pred) that returns an array of losses (one per sample in the input batch) can be passed to compile() as a loss. Note that sample weighting is automatically supported for any such loss.
It sounds like one should be able to use sample weighting through the model.fit(..., sample_weight=sample_weight) method.
In this post (Should the custom loss function in Keras return a single loss value for the batch or an array of losses for every sample in the training batch?) there is a lengthy discussion about the size of the output of the loss function.
And, lastly, it is also mentioned there that a custom loss function should return an array of losses (one per individual sample); their reduction is handled by the framework.
It seems to me that if custom_loss(y_true, y_pred) returns a tensor of shape (batch_size,), then one ought to be able to use sample_weight in the fit method. What am I missing?
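For instance, a toy setup like the following (model and data invented just for illustration) is what I would expect to work:

import numpy as np
import tensorflow as tf

def per_sample_mse(y_true, y_pred):
    # one loss per sample -> shape (batch_size,)
    return tf.reduce_mean(tf.square(y_true - y_pred), axis=-1)

toy_model = tf.keras.Sequential([tf.keras.layers.Dense(1, input_shape=(2,))])
toy_model.compile(optimizer='adam', loss=per_sample_mse)

x = np.random.rand(32, 2).astype('float32')
y = np.random.rand(32, 1).astype('float32')
w = np.ones(32)  # one weight per sample
toy_model.fit(x, y, sample_weight=w, batch_size=8, epochs=1)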
Thanks a lot for any help!
Code snippets:
import tensorflow as tf
import tensorflow_probability as tfp
from tensorflow.keras.losses import Loss

tfd = tfp.distributions

NUM_PARAMS_MG = 3  # three parameter groups per component: alpha, mu, sg


class NegLogLikMixedGaussian(Loss):
    """
    Negative log-likelihood of a mixture of Gaussians with:
    num_components: number of components
    mu: means of the Gaussian components
    sg: standard deviations of the Gaussian components
    """

    def __init__(self, num_params=NUM_PARAMS_MG,
                 num_components=2, name='neg_log_lik_mixed_gaussian'):
        super(NegLogLikMixedGaussian, self).__init__(name=name)
        self.num_params = num_params
        self.num_components = num_components

    def call(self, y_true, p_predict):
        """
        Rem: for an MDN the outputs of the network are _parameters_ of the
        predicted distribution, _not_ point estimates

        Parameters
        ----------
        y_true: (batch_size, 1)
            Observed value of the random variable
        p_predict: (batch_size, num_params * num_components)
            Output parameters of the network given some input

        Returns
        -------
        Negative log-likelihood of each sample, shape (batch_size,)
        """
        # Split the flat parameter vector into mixture weights, means and
        # standard deviations, each of shape (batch_size, num_components)
        alpha, mu, sg = tf.split(p_predict,
                                 num_or_size_splits=self.num_params, axis=1)
        gm = tfd.MixtureSameFamily(
            mixture_distribution=tfd.Categorical(probs=alpha),
            components_distribution=tfd.Normal(loc=mu, scale=sg))
        log_likelihood = tf.transpose(gm.log_prob(tf.transpose(y_true)))
        # One loss value per sample; the reduction is left to the framework
        return -tf.reduce_mean(log_likelihood, axis=-1)
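To check that call indeed returns one loss per sample, here is a quick shape test with made-up numbers (with num_components=2 the network outputs 3 * 2 = 6 parameters per sample):

loss_fn = NegLogLikMixedGaussian(num_params=3, num_components=2)
y_dummy = tf.zeros((4, 1))
p_dummy = tf.constant([[0.5, 0.5, 0.0, 1.0, 1.0, 1.0]] * 4)
print(loss_fn.call(y_dummy, p_dummy).shape)  # expect (4,): one loss per sample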
My hope was then to be able to use:
from tensorflow.keras.optimizers import Adam

model.compile(optimizer=Adam(learning_rate=0.005),
              loss=NegLogLikMixedGaussian(num_components=2, num_params=3))
And:
import numpy as np

# For testing purposes: uniform weights, which should give the same
# results as the un-weighted run
sample_weight = np.ones(len(y_train)) / len(y_train)

# Some non-trivial weights
sample_weights = np.zeros(len(y_train))
sample_weights[:5] = 1

# This gives me the same results as above
model.fit(x_train, y_train, sample_weight=sample_weights,
          batch_size=128, epochs=10)
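To see whether the weights are picked up at all, I also intend to call the loss object directly (just a sketch, reusing loss_fn from the shape check above; per the Loss API, __call__ accepts sample_weight):

p = model.predict(x_train[:5])
print(loss_fn(y_train[:5], p))  # unweighted scalar
print(loss_fn(y_train[:5], p,
              sample_weight=np.array([5., 1., 1., 1., 1.])))
# if weighting is applied, the two printed scalars should differ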