
This is similar to the question "How to update model parameters with accumulated gradients?"

I have a large network and a very small batch size. To compensate, I want to accumulate gradients over multiple forward/backward passes and then update the parameters once, using the mean gradient.
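To make clear what I mean by accumulating, here is a minimal sketch in plain Python. It uses a toy one-parameter linear model with a mean-squared-error loss, not my real network, and every name in it (`mse_grad`, `accumulated_step`, `micro_batches`) is made up for illustration.

```python
# Toy sketch of gradient accumulation for y = w * x with MSE loss.
# All names here are hypothetical; this is not tied to any framework.

def mse_grad(w, xs, ys):
    """Gradient of mean((w*x - y)^2) with respect to w over one batch."""
    n = len(xs)
    return sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys)) / n

def accumulated_step(w, micro_batches, lr):
    """One pass per micro-batch, then a single update with the mean gradient."""
    grad_sum = 0.0
    for xs, ys in micro_batches:
        grad_sum += mse_grad(w, xs, ys)  # accumulate, do not update yet
    mean_grad = grad_sum / len(micro_batches)
    return w - lr * mean_grad

xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]

# Two equally sized micro-batches that together form the full batch.
micro_batches = [(xs[:2], ys[:2]), (xs[2:], ys[2:])]

w_accum = accumulated_step(0.0, micro_batches, lr=0.01)
w_full = 0.0 - 0.01 * mse_grad(0.0, xs, ys)  # single step on the full batch
```

For this toy model, with equally sized micro-batches, the accumulated update matches a single step on the full batch. That equivalence is what I would like to keep in my real network.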

However, my network has batch normalization (BN) layers. How should I handle them when accumulating gradients?
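The reason I suspect BN is a problem: as far as I understand, in training mode it normalizes with the statistics of whatever batch it sees, so each micro-batch is normalized differently from how the full batch would be. Below is a small plain-Python sketch of that effect. The `batch_norm` function is a simplified stand-in I wrote for illustration (no learnable scale/shift, no running averages), not any library's implementation.

```python
# Simplified stand-in for training-mode batch normalization of one feature.
# Assumption: no learnable scale/shift and no running statistics.

def batch_norm(xs, eps=1e-5):
    """Normalize values using the mean and (biased) variance of this batch."""
    n = len(xs)
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs) / n
    return [(x - mean) / (var + eps) ** 0.5 for x in xs]

full_batch = [1.0, 2.0, 3.0, 4.0]
micro_batches = [full_batch[:2], full_batch[2:]]

# Normalizing the full batch at once ...
normalised_full = batch_norm(full_batch)

# ... versus normalizing each micro-batch with its own statistics.
normalised_micro = [y for mb in micro_batches for y in batch_norm(mb)]

# Largest per-element difference between the two results.
max_diff = max(abs(a - b) for a, b in zip(normalised_full, normalised_micro))
```

Here both micro-batches end up with the same normalized values even though their inputs differ, and neither matches the full-batch result. So the accumulated gradients would not correspond to those of one large batch.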

Dammi

0 Answers