
I read in a related question that Keras custom loss functions have to return one scalar per batch item.

I wrote a loss function that outputs a single scalar for the whole batch, and the network still seems to converge. However, I cannot find any documentation on this, or work out what exactly happens in the code. Is there broadcasting done somewhere? What happens if I add sample weights? Does anyone have a pointer to where the magic happens?
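For illustration, the loss I mean is roughly of this shape (a toy sketch, not my actual function):

    import tensorflow as tf

    # Toy example: the loss collapses everything to a single scalar,
    # i.e. shape () instead of the (batch_size,) vector Keras usually expects.
    def batch_scalar_mse(y_true, y_pred):
        return tf.reduce_mean(tf.square(y_true - y_pred))

    model = tf.keras.Sequential([
        tf.keras.layers.Input(shape=(4,)),
        tf.keras.layers.Dense(1),
    ])
    model.compile(optimizer="adam", loss=batch_scalar_mse)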

Thanks!

Scratch

1 Answer


In general you can often use a scalar in place of a vector, and it will be interpreted as a vector filled with that value (e.g. 1 is interpreted as [1, 1, 1, 1]). So if the result of your loss function for a batch is x, it is treated as if you were saying that the loss for each item in the batch is x.
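As a small sketch of what that broadcasting looks like (plain TensorFlow here, not the exact code path Keras takes internally):

    import tensorflow as tf

    per_sample = tf.constant([0.5, 0.5, 0.5, 0.5])  # one loss value per batch item
    scalar     = tf.constant(0.5)                   # one loss value for the whole batch

    weights = tf.constant([1.0, 2.0, 1.0, 2.0])     # e.g. sample weights

    # The scalar is broadcast against the weight vector, so both
    # expressions produce the same weighted result.
    print(tf.reduce_mean(per_sample * weights).numpy())  # 0.75
    print(tf.reduce_mean(scalar * weights).numpy())      # 0.75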

jlanik
  • That's my guess, but I'm looking for the exact line of code where this happens, for confirmation. – Scratch Aug 30 '18 at 13:26
  • The exact line in the implementation of Keras (or TensorFlow) where the broadcasting is implemented? – jlanik Aug 30 '18 at 13:33
  • Take a look at: https://github.com/keras-team/keras/blob/545acc4b66b5fb0ed5457ef7288d836f022c118d/keras/engine/training.py#L340, this is where the total loss is calculated. – Mark Loyman Sep 01 '18 at 19:48