I'd like to compute the first- and second-order derivatives of a neural network with respect to its input. As a toy model, I defined a custom layer with a method called "derivative" that does exactly this:
import torch
from torch import nn
from torch.autograd import grad
import numpy as np
class Exponential(nn.Module):
    def __init__(self):
        super().__init__()
        self.beta = nn.Parameter(nn.init.uniform_(torch.empty(1), 0.1, 2.0), requires_grad=True)

    def forward(self, x):
        return torch.exp(-self.beta * x**2)

    def derivative(self, X):
        # zero tensors to hold the first- and second-order derivatives
        grads_1, grads_2 = torch.zeros_like(X), torch.zeros_like(X)
        # loop over the elements of the input vector
        for i in np.arange(X.shape[0]):
            x = X[i].requires_grad_(True)
            y = self.forward(x)
            # first-order derivative dy/dx
            grads_1[i] = grad(y, x, create_graph=True, retain_graph=True)[0]
            # second-order derivative d2y/dx2
            grads_2[i] = grad(grads_1[i], x, create_graph=True, retain_graph=True)[0]
        return grads_1, grads_2
I compared against finite-difference calculations of the derivatives and the results seem correct. However, the computation is very slow! I wonder how one would go about speeding it up, perhaps by vectorising the loop. Such vectorisation is possible using the "backward" method, e.g. see the answer to this question; however, I could not generalise that approach to second-order derivatives. Thanks.
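For reference, the finite-difference check I did was roughly along these lines (the double-precision input, the grid of 50 points, and the step size h = 1e-3 are just illustrative choices):

model = Exponential()
X = torch.linspace(-2.0, 2.0, 50, dtype=torch.float64)

# autograd derivatives from the custom method
grads_1, grads_2 = model.derivative(X)

# central finite differences for comparison
h = 1e-3
with torch.no_grad():
    fd_1 = (model(X + h) - model(X - h)) / (2 * h)
    fd_2 = (model(X + h) - 2 * model(X) + model(X - h)) / h**2

print((grads_1 - fd_1).abs().max().item(), (grads_2 - fd_2).abs().max().item())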
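The first-order vectorisation I am referring to looks roughly like this (the input values are again just an example). Because each output element depends only on the corresponding input element, a single backward pass through a vector of ones yields all first-order derivatives at once:

model = Exponential()
X = torch.linspace(-2.0, 2.0, 50, requires_grad=True)
y = model(X)
# one backward pass with a vector of ones gives dy_i/dx_i for every i at once
y.backward(torch.ones_like(y))
grads_1 = X.grad

It is this trick that I could not extend to the second-order derivatives grads_2.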