
I am trying to write some custom layers in Keras. The ultimate goal is for certain parameters (updated according to a fixed formula after each batch of data is optimized over during training) to be passed to the loss function. I do not believe dynamic loss functions are possible in Keras, but I think I should be able to pass these parameters to the loss function using multiple inputs and a custom layer.

I want to know whether it is possible to create a layer in Keras whose parameters are not trainable (and not optimized over at all), but are instead updated according to a fixed formula at the end of each batch optimization during training.
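For reference, here is a minimal sketch of the kind of layer I have in mind (written against tf.keras; the layer name and shapes are just placeholders). The extra parameter is created with trainable=False, so the optimizer never updates it, but it can still be read inside the layer and overwritten from outside, e.g. by a callback:

```python
import tensorflow as tf
from tensorflow import keras

class LayerWithFixedParam(keras.layers.Layer):
    """Dense-like layer with an extra parameter excluded from gradient updates."""

    def __init__(self, units, **kwargs):
        super().__init__(**kwargs)
        self.units = units

    def build(self, input_shape):
        # ordinary trainable kernel, updated by the optimizer as usual
        self.kernel = self.add_weight(
            name="kernel",
            shape=(int(input_shape[-1]), self.units),
            initializer="glorot_uniform",
            trainable=True,
        )
        # non-trainable parameter: never touched by the optimizer,
        # but it can be read (e.g. in a loss) and set manually from a callback
        self.c = self.add_weight(
            name="c",
            shape=(self.units,),
            initializer="ones",
            trainable=False,
        )
        super().build(input_shape)

    def call(self, inputs):
        return tf.matmul(inputs, self.kernel)
```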

The simplest example I can give: instead of optimizing a generic cost function (like cross-entropy), I want to optimize something proportional to the cross-entropy (c*cross_entropy). After one batch of data is processed in the training procedure, I want to set, for example, c = 1.2*c, and have this used as the value of c for the next batch. (This particular example is more or less useless, since multiplying the loss by a positive constant shouldn't change the minima, but it's fairly close to what I actually need to do.)
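Concretely, for the c*cross_entropy example, I imagine something along these lines (a sketch only; scaled_crossentropy and ScaleUpdater are made-up names, and I'm assuming tf.keras with a backend variable holding c):

```python
from tensorflow import keras
from tensorflow.keras import backend as K

# c lives in a backend variable so the compiled loss always reads its current value
c = K.variable(1.0, name="c")

def scaled_crossentropy(y_true, y_pred):
    # loss proportional to the cross-entropy, with a factor updated outside training
    return c * K.categorical_crossentropy(y_true, y_pred)

class ScaleUpdater(keras.callbacks.Callback):
    def on_train_batch_end(self, batch, logs=None):
        # the fixed update formula: c <- 1.2 * c, applied after every batch
        K.set_value(c, 1.2 * K.get_value(c))

# model.compile(optimizer="adam", loss=scaled_crossentropy)
# model.fit(x_train, y_train, callbacks=[ScaleUpdater()])
```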

  • If your parameter is a constant value, you can just pass it to a custom loss function. As I understand from your question, those parameters do not take part in the training process. If you have parameters the same length as your training data, you can append them to your labels and use them in a custom loss. Have a look here if that is what you want: [https://stackoverflow.com/a/55530654/8625500] – Anakin Apr 11 '19 at 09:18
  • Thanks for your post. I believe the example I gave was somewhat misleading. The parameters will take part in the training process. In the example of the constant multiple of the loss function above, the parameter would be multiplied by a constant at the end of each batch minimization. The second part is that each layer in my neural network will have a (non-trainable) parameter the same size as its output tensor, and the dot product of the non-trainable parameter and the layer's output should be used in the loss function. The non-trainable parameter then needs to be updated by a formula. – Question Apr 11 '19 at 18:34
  • I understood your second part, but not the first. Can you elaborate? – Anakin Apr 11 '19 at 19:24
  • The final goal is for each layer in my network to have one non-trainable parameter associated with it. The loss function will involve some dot products of these non-trainable parameters and the outputs of each layer. These will be summed and averaged. Then I want to multiply the averaged absolute sum of these by a scalar parameter. The parameters involved in the dot products (and the scalar) should be updated according to a fixed formula after each optimization. The "final" loss function will be a weighted product of the loss function described and a normal one, with the scalar determining the weighting (a rough sketch of this setup follows these comments). – Question Apr 11 '19 at 20:10
  • Okay. I think I got a clearer picture now. One question though. Why do you need the non-trainable parameters in a layer? Can't they just be calculated in parallel, as you only update them after each epoch (if I understood correctly)? – Anakin Apr 11 '19 at 20:40
  • Yes, that's correct. The only reason I thought it might be preferable to have them in a layer is that they are used in the cost function (though fixed for the given epoch), and I thought that use in the layers may have allowed them to be passed as parameters to the cost function rather than using a new cost function and recompiling the model at each epoch. I am not sure if that is indeed important or a workable or good approach though. – Question Apr 15 '19 at 18:16
  • If that is the case, maybe have a look at the answer I referred to above. You can use that technique to solve your problem. – Anakin Apr 15 '19 at 20:50
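
Putting together what was discussed in the comments, a rough sketch of the whole setup might look like the following (illustrative only, written in the style of the 2019-era tf.keras functional API: the layer sizes, the variable names v1, v2 and alpha, the update formulas in ParamUpdater, and the use of model.add_loss to combine the auxiliary term with a normal cross-entropy loss are all assumptions, not a definitive implementation):

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import backend as K

# a small functional model whose hidden activations enter the loss
inputs = keras.Input(shape=(16,))
h1 = keras.layers.Dense(32, activation="relu")(inputs)
h2 = keras.layers.Dense(32, activation="relu")(h1)
outputs = keras.layers.Dense(10, activation="softmax")(h2)
model = keras.Model(inputs, outputs)

# one non-trainable vector per hidden layer, plus the scalar weight
v1 = K.variable(np.ones(32, dtype="float32"), name="v1")
v2 = K.variable(np.ones(32, dtype="float32"), name="v2")
alpha = K.variable(0.5, name="alpha")

# auxiliary term: averaged absolute value of the per-sample dot products,
# weighted by alpha and added on top of the compiled cross-entropy loss
aux = alpha * K.mean(K.abs(K.sum(h1 * v1, axis=-1) + K.sum(h2 * v2, axis=-1)))
model.add_loss(aux)
model.compile(optimizer="adam", loss="categorical_crossentropy")

class ParamUpdater(keras.callbacks.Callback):
    def on_train_batch_end(self, batch, logs=None):
        # placeholders for the real update formulas
        K.set_value(alpha, 0.9 * K.get_value(alpha))
        K.set_value(v1, 1.01 * K.get_value(v1))
        K.set_value(v2, 1.01 * K.get_value(v2))

# model.fit(x_train, y_train, callbacks=[ParamUpdater()])
```

Because the variables are backend variables rather than compiled constants, updating them in the callback changes the loss seen by the next batch without recompiling the model.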

0 Answers