5

Let's consider a simple tensor `x` and define another one that depends on `x` and has multiple dimensions: y = (x, 2x, x^2).

How can I get the full gradient dy/dx = (1, 2, 2x)?

For example, let's take the code:

import torch
from torch.autograd import grad

x = 2 * torch.ones(1)
x.requires_grad = True
y = torch.cat((x, 2*x, x*x))
# dy_dx = ???

This is what I have unsuccessfully tried so far:

>>> dy_dx = grad(y, x, grad_outputs=torch.ones_like(y), create_graph=True)
(tensor([7.], grad_fn=<AddBackward0>),)
>>> dy_dx = grad(y, x, grad_outputs=torch.Tensor([1,0,0]), create_graph=True)
(tensor([1.], grad_fn=<AddBackward0>),)
>>> dy_dx = grad(y, [x,x,x], grad_outputs=torch.eye(3), create_graph=True)
(tensor([7.], grad_fn=<AddBackward0>),)

Each time I got only part of the gradient, or an accumulated version...

I know I could use a for loop based on the second expression, like:

dy_dx = torch.zeros_like(y)
coord = torch.zeros_like(y)
for i in range(y.size(0)):
    # select the i-th output with a one-hot grad_outputs vector
    coord[i] = 1
    dy_dx[i], = grad(y, x, grad_outputs=coord, create_graph=True)
    coord[i] = 0

However, as I am handling high-dimensional tensors, this for loop could take too much time to compute. Moreover, there must be a way to compute the full Jacobian without accumulating the gradients...

Does anyone have a solution? Or an alternative?

  • 1
When you pass `grad_outputs`, what you get back is the product between the gradient and `grad_outputs`. In your case, the gradient is `(1, 2, 2x)` with `x == 2`, so what you get back (if `grad_outputs` is `[1, 1, 1]`) is like `([1, 2, 4] * [1, 1, 1]).sum()`. I'm not aware of a way to get the individual gradients back - do you actually need them? I've always been able to use a product of the gradients and some other tensor when I've wanted gradients. What would you do next if you had the individual gradients? – Nathan Aug 06 '19 at 14:34
Thanks @Nathan! Indeed, I need the full gradient because I am currently trying to solve a differential equation using a neural network. I thought of other ways to do it, but it appears that if I had the full gradient, my code would be much more performant! – Kilian Hersent Aug 06 '19 at 14:45
  • Possible duplicate of [backward function in PyTorch](https://stackoverflow.com/questions/57248777/backward-function-in-pytorch) – Shai Aug 06 '19 at 19:27
Sorry @Shai, but I don't think so... My question is not about what the argument `grad_outputs` of the `grad` function is (which is equivalent to the argument given to the `backward` function). My question is "how do I get the full Jacobian in PyTorch?". In other words, taking the example of [backward function in PyTorch](https://stackoverflow.com/questions/57248777/backward-function-in-pytorch), how do I get the "2-by-3-by-2-by-3 output" of the `backward` function? Edit: However, as you rightly pointed out, my title was ambiguous and I changed it. – Kilian Hersent Aug 06 '19 at 20:02
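A minimal sketch (reusing the question's setup) of the vector-Jacobian product described in the first comment: with `grad_outputs = torch.ones_like(y)`, `grad` returns the dot product `1*1 + 2*1 + 4*1 = 7`, which matches the first output shown above.

import torch
from torch.autograd import grad

x = 2 * torch.ones(1)
x.requires_grad = True
y = torch.cat((x, 2 * x, x * x))

# grad with grad_outputs computes a vector-Jacobian product:
# the Jacobian is (1, 2, 2x) = (1, 2, 4) at x == 2, so v @ J = 1 + 2 + 4 = 7
vjp, = grad(y, x, grad_outputs=torch.ones_like(y), create_graph=True)
print(vjp)  # tensor([7.], grad_fn=<AddBackward0>)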

1 Answer

3

`torch.autograd.grad` in PyTorch is aggregated: it computes a vector-Jacobian product, not the full Jacobian. To differentiate a vector-valued output with respect to the input and get the full Jacobian, use `torch.autograd.functional.jacobian`.
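For example, a minimal sketch of the `jacobian`-based approach for the question's y = (x, 2x, x^2) (available in PyTorch >= 1.5; the helper name `f` is just for illustration):

import torch
from torch.autograd.functional import jacobian

def f(x):
    # same mapping as in the question: y = (x, 2x, x^2)
    return torch.cat((x, 2 * x, x * x))

x = 2 * torch.ones(1)

# full Jacobian dy/dx, one row per output component; for x == 2 this is
# tensor([[1.], [2.], [4.]])
dy_dx = jacobian(f, x, create_graph=True)
print(dy_dx)

Here `create_graph=True` keeps the graph so the result can itself be differentiated (useful when solving a differential equation with a network, as in the question). Depending on the PyTorch version, `jacobian` may also accept a `vectorize=True` flag that batches the backward passes instead of looping over the outputs.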

– alxyok (edited by iacob)