I have begun to learn TensorFlow with the official guide: https://www.tensorflow.org/guide.
My comprehension is struggling with a part of the guide named "Automatic differentiation", and especially the item "Took gradients through a stateful object".
I don't understand why they say that a stateful object stops the gradient. The guide gives this piece of code:
import tensorflow as tf

x0 = tf.Variable(3.0)
x1 = tf.Variable(0.0)

with tf.GradientTape() as tape:
    # Update x1 = x1 + x0.
    x1.assign_add(x0)
    # The tape starts recording from x1.
    y = x1**2  # y = (x1 + x0)**2

# This doesn't work.
print(tape.gradient(y, x0))  # dy/dx0 = 2*(x1 + x0)
Why doesn't the tape record x0? Is it that the function .assign_add(x0), which increments x1, overshadows x0? Is it because assign_add picks up the value of x0 and steals its allocated memory? Is that the right reason, or is there another reason that I don't see?
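To show the contrast I mean, here is a minimal experiment I tried (my own sketch, not from the guide): when y is built from x0 with ordinary ops inside the tape, the gradient comes out, but after assign_add the tape only sees x1's new value and returns None for x0.

```python
import tensorflow as tf

x0 = tf.Variable(3.0)
x1 = tf.Variable(0.0)

# Case 1: y depends on x0 through ordinary differentiable ops.
with tf.GradientTape() as tape:
    y = (x1 + x0) ** 2
print(tape.gradient(y, x0))  # tf.Tensor(6.0, ...) -> gradient flows

# Case 2: y depends on x0 only through the state update assign_add.
x1.assign(0.0)
with tf.GradientTape() as tape:
    x1.assign_add(x0)  # in-place state update
    y = x1 ** 2
print(tape.gradient(y, x0))  # None -> gradient does not flow
```

So the question is really why case 2 differs from case 1.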
Thank you in advance for your answers.