I'm writing a custom loss function in Keras and just tripped over the following:
Why do Keras loss functions have to return one scalar per batch item rather than a single scalar for the whole batch?
Surely what matters for training is the cumulative loss over the whole batch, not the loss of each individual item?
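
To make the distinction concrete, here is a minimal sketch of the two return shapes I mean. It uses plain Python lists as stand-ins for tensors so it runs without Keras; the function names and the squared-error formula are purely illustrative assumptions, not taken from any Keras API.

```python
# Plain Python lists stand in for tensors here; this is NOT Keras code.
# Both function names are hypothetical and only illustrate the question.

def per_item_squared_error(y_true, y_pred):
    """Return one loss value per batch item (the shape Keras asks for)."""
    return [(t - p) ** 2 for t, p in zip(y_true, y_pred)]


def batch_squared_error(y_true, y_pred):
    """Return a single scalar for the whole batch (what I expected to write)."""
    return sum(per_item_squared_error(y_true, y_pred))


# A toy batch of three items.
y_true = [1.0, 2.0, 3.0]
y_pred = [1.0, 4.0, 0.0]

per_item = per_item_squared_error(y_true, y_pred)  # list with one entry per item
total = batch_squared_error(y_true, y_pred)        # a single float
```

The first function is the form I am apparently required to write; the second is the form I would have expected to be enough.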