
When building a simple perceptron neural network, we usually pass a 2D input matrix of shape (batch_size, features) to a 2D weight matrix, similar to this simple neural network in numpy. I always assumed that a Perceptron/Dense/Linear layer of a neural network only accepts 2D input and outputs another 2D output. But recently I came across this PyTorch model in which a Linear layer accepts a 3D input tensor and outputs another 3D tensor (o1 = self.a1(x)).

import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim

class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.a1 = nn.Linear(4,4)
        self.a2 = nn.Linear(4,4)
        self.a3 = nn.Linear(9,1)
    def forward(self,x):
        o1 = self.a1(x)                  # (B, 3, 4): Linear acts on the last dimension
        o2 = self.a2(x).transpose(1,2)   # (B, 4, 3)
        output = torch.bmm(o1,o2)        # (B, 3, 3)
        output = output.view(len(x),9)   # (B, 9)
        output = self.a3(output)         # (B, 1)
        return output

x = torch.randn(10,3,4)
y = torch.ones(10,1)

net = Net()

criterion = nn.MSELoss()
optimizer = optim.Adam(net.parameters())

for i in range(10):
    net.zero_grad()
    output = net(x)
    loss = criterion(output,y)
    loss.backward()
    optimizer.step()
    print(loss.item())

These are the questions I have:

  1. Is the above neural network valid? That is, will the model train correctly?
  2. Even after being passed a 3D input x = torch.randn(10,3,4), why does PyTorch's nn.Linear not raise an error, and why does it produce a 3D output?
Eka

2 Answers


Newer versions of PyTorch allow nn.Linear to accept an N-D input tensor; the only constraint is that the last dimension of the input tensor must equal in_features of the linear layer. The linear transformation is then applied to the last dimension of the tensor.
For instance, if in_features=5 and out_features=10 and the input tensor x has shape (2, 3, 5), then the output tensor will have shape (2, 3, 10).
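
A minimal sketch verifying that example (the layer and tensor names here are just illustrative):

import torch
import torch.nn as nn

layer = nn.Linear(in_features=5, out_features=10)
x = torch.randn(2, 3, 5)   # last dimension matches in_features
y = layer(x)               # transformation applied to the last dimension
print(y.shape)             # torch.Size([2, 3, 10])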

Shai

If you have a look at the documentation, you will find that the Linear layer indeed accepts tensors of arbitrary shape, where only the last dimension must match the in_features argument you specified in the constructor (the docs write this as input shape (*, H_in) and output shape (*, H_out), where * means any number of leading dimensions).

The output will have exactly the same shape as the input; only the last dimension will change to whatever you specified as out_features in the constructor.

It works by applying the same layer (with the same weights) to each of the (possibly many) input vectors. In your example, the input shape (10, 3, 4) is basically a set of 10 * 3 == 30 4-dimensional vectors. So your layers a1 and a2 are applied to all 30 of these vectors to generate another 10 * 3 == 30 4D vectors as the output (because you specified out_features=4 in the constructor).
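
You can check this equivalence directly, using the shapes from your example (a minimal sketch):

import torch
import torch.nn as nn

layer = nn.Linear(4, 4)
x = torch.randn(10, 3, 4)

out_3d = layer(x)                                   # applied to the last dimension
out_2d = layer(x.reshape(30, 4)).reshape(10, 3, 4)  # flatten to 30 vectors, apply, restore
print(torch.allclose(out_3d, out_2d))               # True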

So, to answer your questions:

Is the above neural network valid? That is, will the model train correctly?

Yes, it is valid, and it will be trained "correctly" from a technical point of view. But, as with any other network, whether it will actually tackle your problem correctly is another question.

Even after being passed a 3D input x = torch.randn(10,3,4), why does PyTorch's nn.Linear not raise an error, and why does it produce a 3D output?

Well, because it is defined to work this way.

sebrockm