
I have recently been studying PyTorch and its backward function. I understand how to use it, but when I try

import torch
from torch.autograd import Variable

x = Variable(2*torch.ones(2, 2), requires_grad=True)
x.backward(x)
print(x.grad)

I expect

tensor([[1., 1.],
        [1., 1.]])

because it is an identity function. However, it returns

tensor([[2., 2.],
        [2., 2.]])

Why does this happen?


2 Answers


Actually, this is what you are looking for:

Case 1: when z = 2*x**3 + x

import torch
from torch.autograd import Variable
x = Variable(2*torch.ones(2, 2), requires_grad=True)
z = 2 * x**3 + x
z.backward(torch.ones_like(z))
print(x.grad)

output (dz/dx = 6x^2 + 1, which is 25 at x = 2):

tensor([[25., 25.],
        [25., 25.]])

Case 2: when z = x*x

x = Variable(2*torch.ones(2, 2), requires_grad=True)
z = x*x
z.backward(torch.ones_like(z))
print(x.grad)

output (dz/dx = 2x, which is 4 at x = 2):

tensor([[4., 4.],
        [4., 4.]])

Case 3: when z = x (your case)

x = Variable(2*torch.ones(2, 2), requires_grad=True)
z = x
z.backward(torch.ones_like(z))
print(x.grad)

output (dz/dx = 1, so the ones passed to backward() come back unchanged):

tensor([[1., 1.],
        [1., 1.]])

To learn more about how gradients are calculated in PyTorch, check this.
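
As a side note, Variable has been a thin no-op wrapper since PyTorch 0.4, so on any reasonably recent version the same example can be written without it. A minimal sketch (torch.full is used here simply to build a leaf tensor with requires_grad set directly):

import torch

# Build the leaf tensor directly; no Variable wrapper is needed
x = torch.full((2, 2), 2.0, requires_grad=True)
z = 2 * x**3 + x
z.backward(torch.ones_like(z))
print(x.grad)  # tensor([[25., 25.], [25., 25.]])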


I think you misunderstand how to use tensor.backward(). The argument passed to backward() is not the x of dy/dx.

For example, if y is obtained from x by some operation, then y.backward(w) first forms l = dot(y, w), with w treated as a constant, and then calculates dl/dx. So for your code, y = x and w = x with value 2, which means PyTorch first forms l = 2*sum(x), and dl/dx = 2 is what your code returns.
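
A minimal sketch of this view (the weight tensor w below is an arbitrary example value, not anything prescribed by PyTorch): calling y.backward(w) produces the same x.grad as explicitly building the scalar l = dot(y, w) and calling l.backward():

import torch

w = torch.tensor([[1., 2.], [3., 4.]])  # arbitrary constant weights

# Route 1: pass w directly to backward()
x1 = torch.full((2, 2), 2.0, requires_grad=True)
y1 = x1 * x1
y1.backward(w)

# Route 2: form the scalar l = dot(y, w) by hand, then call backward() with no argument
x2 = torch.full((2, 2), 2.0, requires_grad=True)
y2 = x2 * x2
l = (y2 * w).sum()  # elementwise product summed, i.e. dot(y, w)
l.backward()

print(torch.equal(x1.grad, x2.grad))  # True: both give 2*x*w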

When you call y.backward(w), just fill the argument of backward() with ones if y is not a scalar; if y is a scalar, pass no argument at all.
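
For instance, a minimal sketch of both cases:

import torch

# Non-scalar output: pass ones, which is equivalent to differentiating y.sum()
x = torch.full((2, 2), 2.0, requires_grad=True)
y = x * x
y.backward(torch.ones_like(y))
print(x.grad)  # tensor([[4., 4.], [4., 4.]])

# Scalar output: call backward() with no argument
x = torch.full((2, 2), 2.0, requires_grad=True)
y = (x * x).sum()
y.backward()
print(x.grad)  # tensor([[4., 4.], [4., 4.]])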