
I have a model denoted by f().

Suppose the target is t, f(x1) = y1 and f(x2) = y2 and my loss is defined as
loss = mse(y1,y2) + mse(y2,t)

Since both y1 and y2 require grad, I have received an error such as

one of the variables needed for gradient computation has been modified by an inplace operation

My understanding is that if I evaluate y1 first, the graph is changed when I then evaluate y2. Should I freeze one tensor, e.g. y1_no_grad = y1.detach().numpy(), and then use loss = mse(y1_no_grad, y2) + mse(y2, t)?
However, I still receive the error Can't call numpy() on Tensor that requires grad. Use tensor.detach().numpy(), and I am not sure whether that is because y1_no_grad is a NumPy array while y2 is a tensor.
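
For reference, the numpy() error goes away if the detached value stays a tensor; a minimal sketch with stand-in values for y1 and y2 (not my actual model):

import torch
from torch.nn.functional import mse_loss

t = torch.ones(1)
y1 = torch.randn(1, requires_grad=True) ** 2  # stand-in for f(x1)
y2 = torch.randn(1, requires_grad=True) ** 2  # stand-in for f(x2)

y1_no_grad = y1.detach()  # still a tensor, just cut out of the graph
loss = mse_loss(y1_no_grad, y2) + mse_loss(y2, t)
loss.backward()           # gradients flow through y2's graph only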

Update:

I realized my problem afterwards. It was because I created multiple loss tensors, and after calling backward() on the first loss I stepped the optimizer, which changed the parameters in place. This caused the error when I then called backward() on another loss tensor.

E.g.

f(x1) = y1
f(x2) = y2
f(x3) = y3 
...
f(xn) = yn

f(x) = y

for i in range(n):
    optimizer.zero_grad()
    loss = mse(y, yi) + mse(yi, t)  # y and yi were computed before the loop
    loss.backward()                 # fails from the second iteration on
    optimizer.step()                # in-place parameter update breaks the next backward
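
A minimal runnable version of this failure mode (the model, optimizer, and shapes are stand-ins for my setup; retain_graph=True is needed because y's graph is reused across iterations):

import torch
from torch.nn.functional import mse_loss

f = torch.nn.Linear(4, 1)                # stand-in for my model
optimizer = torch.optim.SGD(f.parameters(), lr=0.1)

t = torch.ones(1)
x = torch.randn(4)
xs = [torch.randn(4) for _ in range(3)]  # x1 ... xn

y = f(x)                                 # evaluated once, up front
ys = [f(xi) for xi in xs]                # y1 ... yn, also up front

for yi in ys:
    optimizer.zero_grad()
    loss = mse_loss(y, yi) + mse_loss(yi, t)
    loss.backward(retain_graph=True)     # y's graph is shared, so retain it
    optimizer.step()                     # modifies f's weight in place
    # the second iteration raises:
    # "one of the variables needed for gradient computation has been
    #  modified by an inplace operation"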

To me the solutions are either:

1. Accumulate the loss tensors before calling backward(), i.e.:

optimizer.zero_grad()
loss = 0
for i in range(n):
    loss = loss + mse(y, yi) + mse(yi, t)  # accumulate into a single tensor
loss.backward()   # one backward pass over the combined graph
optimizer.step()  # step only after all gradients are computed
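
A runnable version of option 1, using the same stand-in setup as the snippet above:

import torch
from torch.nn.functional import mse_loss

f = torch.nn.Linear(4, 1)                # stand-in model, as before
optimizer = torch.optim.SGD(f.parameters(), lr=0.1)

t = torch.ones(1)
x = torch.randn(4)
xs = [torch.randn(4) for _ in range(3)]

optimizer.zero_grad()
y = f(x)
loss = 0
for xi in xs:
    yi = f(xi)
    loss = loss + mse_loss(y, yi) + mse_loss(yi, t)  # accumulate the losses
loss.backward()    # single backward pass through all the graphs
optimizer.step()   # one in-place update, after all gradients exist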

2. Re-evaluate the outputs before each backward, i.e.:

for i in range(n):
    optimizer.zero_grad()
    y = f(x)          # fresh forward pass builds a fresh graph
    yi = f(xi)
    loss = mse(y, yi) + mse(yi, t)
    loss.backward()
    optimizer.step()  # the in-place update only touches graphs that are discarded anyway
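
A runnable version of option 2, again with stand-in model and data:

import torch
from torch.nn.functional import mse_loss

f = torch.nn.Linear(4, 1)                # stand-in model, as before
optimizer = torch.optim.SGD(f.parameters(), lr=0.1)

t = torch.ones(1)
x = torch.randn(4)
xs = [torch.randn(4) for _ in range(3)]

for xi in xs:
    optimizer.zero_grad()
    y = f(x)                             # fresh graph every iteration
    yi = f(xi)
    loss = mse_loss(y, yi) + mse_loss(yi, t)
    loss.backward()                      # no retain_graph needed
    optimizer.step()                     # safe: the old graphs are gone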
loct
  • I suppose this one might help https://stackoverflow.com/questions/55466298/pytorch-cant-call-numpy-on-variable-that-requires-grad-use-var-detach-num – nikhil6041 Apr 17 '21 at 05:19

1 Answer


Suppose the target is t, f(x1) = y1 and f(x2) = y2 and my loss is defined as loss = mse(y1,y2) + mse(y2,t)

Since both y1 and y2 require grad, I have received an error such as

This statement is incorrect. What you've described does not necessitate in-place operation errors. For example,

import torch
from torch.nn.functional import mse_loss

def f(x):
    return x**2

t = torch.ones(1)

x1 = torch.randn(1, requires_grad=True)
x2 = torch.randn(1, requires_grad=True)

y1 = f(x1)
y2 = f(x2)

# both y1 and y2 require grad, yet backward succeeds without any in-place error
loss = mse_loss(y1, y2) + mse_loss(y2, t)
loss.backward()

does not produce any errors. Your issue is likely somewhere else.

For the general case you described, you should get a computation graph that could be visualized as:

[Figure: computation graph in which x1 → y1 and x2 → y2 feed mse(y1, y2) and mse(y2, t), whose sum is the loss]

The only issue here could be that your function f is not differentiable or is somehow invalid (perhaps an in-place assignment is taking place inside f).
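
For example, a minimal sketch of how an in-place assignment inside f triggers exactly this error (torch.exp is used here because it saves its output for the backward pass; this is an illustration, not your actual f):

import torch

def f(x):
    y = torch.exp(x)  # exp saves its output to compute its gradient
    y += 1            # in-place update clobbers that saved output
    return y

x = torch.randn(3, requires_grad=True)
f(x).sum().backward()  # RuntimeError: one of the variables needed for
                       # gradient computation has been modified by an
                       # inplace operation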

jodag