Given a pytorch
input dataset with dimensions:
dat.shape = torch.Size([128, 3, 64, 64])
This is a supervised learning problem: we have a separate labels.txt
file containing one of C
classes for each input observation. The value of C
is calculated by the number of distinct values in the labeles file and is presently in the single digits.
I could use assistance on how to mesh the layers of a simple mix of convolutional and linear layers network that is performing multiclass classification. The intent is to pass through:
- two cnn layers with maxpooling after each
- a linear "readout" layer
- softmax activation before the output/labels
Here is the core of my (faulty/broken) network. I am unable to determine the proper size/shape required of:
Output of Convolutional layer -> Input of Linear [Readout] layer
class CNNClassifier(torch.nn.Module):
def __init__(self):
super().__init__()
self.conv1 = nn.Conv2d(3, 16, 3)
self.maxpool = nn.MaxPool2d(kernel_size=3,padding=1)
self.conv2 = nn.Conv2d(16, 32, 3)
self.linear1 = nn.Linear(32*16*16, C)
self.softmax1 = nn.LogSoftmax(dim=1)
def forward(self, x):
x = self.conv1(x)
x = self.maxpool(F.leaky_relu(x))
x = self.conv2(x)
x = self.maxpool(F.leaky_relu(x))
x = self.linear1(x) # Size mismatch error HERE
x = self.softmax1(x)
return x
Training of the model is started by :
Xout = model(dat)
This results in :
RuntimeError: size mismatch, m1: [128 x 1568], m2: [8192 x 6]
at the linear1
input. What is needed here ? Note I have seen uses of wildcard input sizes e.g via a view
:
..
x = x.view(x.size(0), -1)
x = self.linear1(x) # Size mismatch error HERE
If that is included then the error changes to
RuntimeError: size mismatch, m1: [28672 x 7], m2: [8192 x 6]
Some pointers on how to think about and calculate the cnn layer / linear layer input/output sizes would be much appreciated.