I have a model denoted by f(). Suppose the target is t, with f(x1) = y1 and f(x2) = y2, and my loss is defined as

loss = mse(y1, y2) + mse(y2, t)
Since both y1 and y2 require grad, I have received an error such as

one of the variables needed for gradient computation has been modified by an inplace operation
My understanding is that, supposing I evaluate y1 first, the graph has been changed by the time I evaluate y2. Should I freeze some tensor, e.g. y1_no_grad = y1.detach().numpy(), and then use

loss = mse(y1_no_grad, y2) + mse(y2, t)

? However, I still receive the error Can't call numpy() on Tensor that requires grad. Use tensor.detach().numpy(), and I am not sure whether that is because y1_no_grad is a NumPy array while y2 is still a tensor.
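(As an aside on that second error: if the goal is only to cut y1 out of the graph, .detach() alone, without .numpy(), keeps it a tensor that mse accepts, so there is no type mismatch. A minimal sketch, assuming mse is torch.nn.functional.mse_loss and using random leaf tensors as stand-ins for f(x1) and f(x2):)

```python
import torch
import torch.nn.functional as F

t = torch.tensor([1.0])
y1 = torch.randn(1, requires_grad=True)  # stand-in for f(x1)
y2 = torch.randn(1, requires_grad=True)  # stand-in for f(x2)

# .detach() returns a tensor cut from the graph, so no NumPy
# conversion (and hence no type mismatch) is needed
loss = F.mse_loss(y1.detach(), y2) + F.mse_loss(y2, t)
loss.backward()
# y1 receives no gradient (it was detached); y2 does
```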
Update:
I realized my problem afterwards. It was because I created multiple loss tensors, called backward() on one of them, and then ran optimizer.step(), which changed the parameters in-place. This caused the error when I backwarded another loss tensor whose graph was built before that update.
E.g.

f(x1) = y1
f(x2) = y2
f(x3) = y3
...
f(xn) = yn
f(x) = y

for i in range(n):
    optimizer.zero_grad()
    loss = mse(y, yi) + mse(yi, t)
    loss.backward()
    optimizer.step()
To me, the solutions are either:

1. Accumulate the loss tensors and call backward only once, i.e.:

total_loss = 0
for i in range(n):
    total_loss = total_loss + mse(y, yi) + mse(yi, t)
total_loss.backward()
optimizer.step()
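A runnable sketch of option 1, again with a hypothetical nn.Linear and made-up data standing in for my setup. The single backward runs before any parameter changes, so no saved tensor goes stale:

```python
import torch

# hypothetical stand-ins for f, x, xs, t from the post
f = torch.nn.Linear(1, 1)
mse = torch.nn.functional.mse_loss
optimizer = torch.optim.SGD(f.parameters(), lr=0.1)

t = torch.tensor([[1.0]])
x = torch.tensor([[0.5]])
xs = [torch.tensor([[float(i)]]) for i in range(3)]

optimizer.zero_grad()
y = f(x)
total_loss = torch.zeros(())
for xi in xs:
    yi = f(xi)
    total_loss = total_loss + mse(y, yi) + mse(yi, t)
total_loss.backward()  # one backward over the whole summed graph
optimizer.step()       # parameters change only after all grads exist
```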
2. Evaluate again before each backward, i.e.:

for i in range(n):
    optimizer.zero_grad()
    y = f(x)
    yi = f(xi)
    loss = mse(y, yi) + mse(yi, t)
    loss.backward()
    optimizer.step()
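And a runnable sketch of option 2 under the same stand-in model: every graph is rebuilt after the parameters change, so backward never sees a stale saved tensor and the loop completes without error:

```python
import torch

# hypothetical stand-ins for f, x, xs, t from the post
f = torch.nn.Linear(1, 1)
mse = torch.nn.functional.mse_loss
optimizer = torch.optim.SGD(f.parameters(), lr=0.1)

t = torch.tensor([[1.0]])
x = torch.tensor([[0.5]])
xs = [torch.tensor([[float(i)]]) for i in range(3)]

for xi in xs:
    optimizer.zero_grad()
    y = f(x)    # fresh forward pass each iteration, so the graph
    yi = f(xi)  # always references the current parameter values
    loss = mse(y, yi) + mse(yi, t)
    loss.backward()
    optimizer.step()
```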