I'm having trouble understanding how to implement backward propagation for Leaky ReLU.
I have read other posts, but I'm still not sure I follow them because the notation isn't spelled out (I can't tell which symbol refers to what).
If I have dA, the post-activation gradient of the current layer, and a cached value Z from forward propagation, is this the correct implementation:
import numpy as np

def leaky_relu_backward(dA, cache):
    """
    The backward propagation for a single leaky ReLU unit.

    Arguments:
    dA - post-activation gradient
    cache - 'Z', stored during forward propagation for computing the backward pass efficiently

    Returns:
    dZ - gradient of the cost with respect to Z
    """
    Z = cache
    # make dZ a copy of dA so dA isn't modified in place
    dZ = np.array(dA, copy=True)
    # when Z <= 0, set dZ to 0.01
    dZ[Z <= 0] = 0.01
    return dZ
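For reference, this is roughly how I'm calling it in my backward pass (the shapes here are made up, just for illustration):

# hypothetical shapes, just to show how the function is called
Z = np.random.randn(4, 3)     # cached pre-activation values from forward prop
dA = np.random.randn(4, 3)    # gradient coming back from the layer above
dZ = leaky_relu_backward(dA, Z)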
Or is there more to it? In this post: How to implement the derivative of Leaky Relu in python? the answer shows a multiplication happening in the return statement, and I'm not sure whether I need that or not.
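If that multiplication is needed, I'm guessing the backward pass would look something like the sketch below instead (using the same numpy import and the same 0.01 negative slope as above; the _alt name is only there so the two versions don't clash):

def leaky_relu_backward_alt(dA, cache):
    """
    My guess at the multiplication-based version: scale dA by the
    local derivative of leaky ReLU instead of overwriting it.
    """
    Z = cache
    # derivative of leaky ReLU is 1 for Z > 0 and 0.01 for Z <= 0
    dZ = np.array(dA, copy=True)
    dZ[Z <= 0] *= 0.01
    return dZ

The only difference from my version is *= instead of =, i.e. scaling dA where Z <= 0 rather than overwriting it. Is that the multiplication the answer was referring to, or am I misreading it?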