Do Keras loss functions have to output one scalar per batch item or one scalar for the whole batch?

Question

I read in a related question that a Keras custom loss function has to return one scalar per batch item.

I wrote a loss function that outputs a single scalar for the whole batch, and the network seems to converge. However, I am not able to find any documentation on this or on what exactly happens in the code. Is there broadcasting done somewhere? What happens if I add sample weights? Does someone have a pointer to where the magic happens?

Thanks!

It would be helpful for analysis to see your model code :-) – Cut7er Aug 30 '18 at 08:27

score 1 · Accepted Answer · answered Aug 30 '18 at 08:47


In general, you can often use a scalar in place of a vector, and it will be interpreted as a vector filled with that value (e.g. 1 is interpreted as [1, 1, 1, 1]); this is broadcasting. So if your loss function returns a single value x for the whole batch, it is treated as if the loss for each item in the batch were x.
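To make the broadcasting concrete, here is a minimal NumPy sketch. It is not the Keras source: `reduce_loss` is a hypothetical stand-in for a final "multiply by sample weights, then average" step, and the loss values and weights are made up for illustration.

```python
import numpy as np


def reduce_loss(loss_values, sample_weights=None):
    """Hypothetical reduction step: optional weighting, then a mean.

    This is an illustrative assumption, not Keras's actual implementation.
    """
    loss_values = np.asarray(loss_values, dtype=float)
    if sample_weights is not None:
        # A 0-d (scalar) loss is broadcast against the (batch,) weight
        # vector here, as if every item had the same loss value.
        loss_values = loss_values * np.asarray(sample_weights, dtype=float)
    return float(np.mean(loss_values))


per_item = np.array([0.5, 1.5, 1.0, 3.0])  # one loss value per batch item
batch_scalar = per_item.mean()             # one loss value for the whole batch
weights = np.array([1.0, 1.0, 0.0, 2.0])   # made-up sample weights

# Without weights, both forms reduce to the same number.
unweighted_vector = reduce_loss(per_item)
unweighted_scalar = reduce_loss(batch_scalar)

# With weights, the two forms diverge.
weighted_vector = reduce_loss(per_item, weights)      # mean of loss_i * w_i
weighted_scalar = reduce_loss(batch_scalar, weights)  # scalar * mean of w_i

print(unweighted_vector, unweighted_scalar)
print(weighted_vector, weighted_scalar)
```

Under these assumptions the unweighted results agree, which would be consistent with the network still converging. With sample weights, the scalar version only scales the batch loss by the average weight, so the weights can no longer emphasize or mask individual items.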


jlanik

That's my guess, but I'm looking for the exact line of code where this could happen, for confirmation. – Scratch Aug 30 '18 at 13:26
The exact line in the implementation of Keras (or TensorFlow) where the broadcasting is implemented? – jlanik Aug 30 '18 at 13:33
Take a look at: https://github.com/keras-team/keras/blob/545acc4b66b5fb0ed5457ef7288d836f022c118d/keras/engine/training.py#L340, this is where the total loss is calculated. – Mark Loyman Sep 01 '18 at 19:48
