I am implementing a quasi-Newton optimizer for TensorFlow. My question is: when the Optimizer's apply_gradients
function is called inside the minimize
function, are the gradients applied at whatever values the tensors happen to have at that moment?
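
To make the scenario concrete, here is a toy sketch in plain Python. It is not TensorFlow code and says nothing about how TensorFlow actually behaves; ToyOptimizer and its two methods are hypothetical names that only mirror a compute-then-apply split. It shows the difference I am asking about: a gradient evaluated at one value of the variable versus a gradient evaluated at the value the variable has when the update is applied.

```python
# Toy illustration in plain Python, not TensorFlow. ToyOptimizer and its
# methods are hypothetical names that only mirror a compute/apply split.

class ToyOptimizer:
    def __init__(self, learning_rate):
        self.learning_rate = learning_rate

    def compute_gradient(self, x):
        # Gradient of f(x) = x**2, evaluated at the value passed in.
        return 2.0 * x

    def apply_gradient(self, grad, x):
        # Plain gradient-descent update using whatever gradient it is handed.
        return x - self.learning_rate * grad


opt = ToyOptimizer(learning_rate=0.1)

x = 3.0
grad = opt.compute_gradient(x)  # gradient evaluated at x = 3.0

x = 1.0  # the variable changes between the two steps

# Update that reuses the gradient computed earlier, at x = 3.0.
stale = opt.apply_gradient(grad, x)

# Update that re-evaluates the gradient at the current value, x = 1.0.
fresh = opt.apply_gradient(opt.compute_gradient(x), x)

print(stale, fresh)
```

The two updates differ, and for a quasi-Newton method I need to know which of these two cases corresponds to what happens inside minimize.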
Cheers, Sergey