I'm defining a residual block in PyTorch for ResNet, in which you can specify how many convolutional layers you want, not necessarily two. This is done through a parameter named nc (number of convs). The first layer takes ni input channels and nf filters; from the second layer on, I apply the conv in a for loop. Here's my code:

import torch
import torch.nn as nn

class ResBlock(nn.Module):
    def __init__(self, ni, nf, nc=2):
        super().__init__()
        self.conv1 = nn.Conv2d(ni, nf, kernel_size=3, stride=2, padding=0)
        self.conv2 = nn.Conv2d(nf, nf, kernel_size=3, stride=1, padding=0)
        self.conv1x1 = nn.Conv2d(ni, nf, kernel_size=1, stride=1, padding=0)
        self.nc = nc

    def forward(self, x):
        y = self.conv1(x)
        for i in range(self.nc - 1):
            y = self.conv2(y)
            print(torch.mean(y))
        return self.conv1x1(x) + y

But no matter what value I give to nc, printing the model always shows 2 convs with kernel size 3. I'm not sure whether a for loop can really do this job in PyTorch, but it worked when I used the functional API in Keras. Could anyone help me understand what's going on?

Hamed

2 Answers

Yeah, printing an nn.Module object is often misleading. When you print it, you get:

# for ni=3, nf=16
ResBlock(
  (conv1): Conv2d(3, 16, kernel_size=(3, 3), stride=(2, 2))
  (conv2): Conv2d(16, 16, kernel_size=(3, 3), stride=(1, 1))
  (conv1x1): Conv2d(3, 16, kernel_size=(1, 1), stride=(1, 1))
)

because these are the only three modules you registered in the __init__ of the ResBlock.

The actual forward can (and in your case does) do something completely different: the loop applies the very same conv2 module over and over, so you only ever have two distinct 3x3 convs no matter what nc is.
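
If you want nc-1 distinct convolutions, each with its own weights and all visible in print, register them with nn.ModuleList (see the other answer). A minimal sketch, assuming padding=1 on the 3x3 convs and stride=2 on the 1x1 shortcut so that the residual addition is shape-compatible:

import torch.nn as nn

class ResBlock(nn.Module):
    def __init__(self, ni, nf, nc=2):
        super().__init__()
        # first conv downsamples; padding=1 keeps the later 3x3 convs shape-preserving
        self.conv1 = nn.Conv2d(ni, nf, kernel_size=3, stride=2, padding=1)
        # nc-1 separate convs: each has its own weights, and all are
        # registered, so they show up when the module is printed
        self.convs = nn.ModuleList(
            nn.Conv2d(nf, nf, kernel_size=3, stride=1, padding=1)
            for _ in range(nc - 1)
        )
        # stride=2 so the shortcut matches conv1's downsampled output
        self.conv1x1 = nn.Conv2d(ni, nf, kernel_size=1, stride=2)

    def forward(self, x):
        y = self.conv1(x)
        for conv in self.convs:
            y = conv(y)
        return self.conv1x1(x) + y

Printing ResBlock(3, 16, nc=4) now lists conv1, a ModuleList with three 3x3 convs, and conv1x1.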

Berriel
  • I see! So how can I fix this? I can't use a for loop then, and I would have to do it manually, like self.conv2(self.conv2(self.conv2(...)))? – Hamed Apr 11 '20 at 23:54

I faced something similar recently. I did not want to use nn.Sequential, and when I printed the model it did not show the operations I had defined with a loop in the forward function. The following worked for me.

I had not used a loop in the __init__ function to define the layers; I only stored a single layer in a variable and then looped over it in the forward function. As a result, print(model) did not show the layers applied by the for loop in forward. The solution was to use nn.ModuleList.

The code below gives a misleading result: only one hidden layer is registered and printed, even though forward applies it seven times, and every application shares the same weights.

Code

import torch.nn as nn

class MyModel(nn.Module):
    def __init__(self, input_units, hidden_units, hidden_layer_count, output_units):
        super(MyModel, self).__init__()
        self.hidden_layer_count = hidden_layer_count

        self.input_layer = nn.Linear(in_features=input_units, out_features=hidden_units)
        self.hidden_layer = nn.Linear(in_features=hidden_units, out_features=hidden_units)
        self.output_layer = nn.Linear(in_features=hidden_units, out_features=output_units)

    def forward(self, item):
        x = self.input_layer(item)
        
        for i in range(self.hidden_layer_count):
            x = self.hidden_layer(x)

        output = self.output_layer(x)
        return output

my_model = MyModel(input_units=2, hidden_units=128, hidden_layer_count=7, output_units=3)
print(my_model)

Output

MyModel(
  (input_layer): Linear(in_features=2, out_features=128, bias=True)
  (hidden_layer): Linear(in_features=128, out_features=128, bias=True)
  (output_layer): Linear(in_features=128, out_features=3, bias=True)
)

The code below defines the layers in a loop with nn.ModuleList, and the forward function traverses them. This gives the expected result.

Code

import torch.nn as nn

class MyModel2(nn.Module):
    def __init__(self, input_units, hidden_units, hidden_layer_count, output_units):
        super(MyModel2, self).__init__()
        self.hidden_layer_count = hidden_layer_count

        self.input_layer = nn.Linear(in_features=input_units, out_features=hidden_units)
        self.hidden_layers = nn.ModuleList([nn.Linear(
            in_features=hidden_units, out_features=hidden_units
            ) for i in range(hidden_layer_count)])
        self.output_layer = nn.Linear(in_features=hidden_units, out_features=output_units)

    def forward(self, item):
        x = self.input_layer(item)
        
        for hidden_layer in self.hidden_layers:
            x = hidden_layer(x)

        output = self.output_layer(x)
        return output

my_model2 = MyModel2(input_units=2, hidden_units=128, hidden_layer_count=7, output_units=3)
print(my_model2)

Output

MyModel2(
  (input_layer): Linear(in_features=2, out_features=128, bias=True)
  (hidden_layers): ModuleList(
    (0): Linear(in_features=128, out_features=128, bias=True)
    (1): Linear(in_features=128, out_features=128, bias=True)
    (2): Linear(in_features=128, out_features=128, bias=True)
    (3): Linear(in_features=128, out_features=128, bias=True)
    (4): Linear(in_features=128, out_features=128, bias=True)
    (5): Linear(in_features=128, out_features=128, bias=True)
    (6): Linear(in_features=128, out_features=128, bias=True)
  )
  (output_layer): Linear(in_features=128, out_features=3, bias=True)
)
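
A quick sanity check: only the parameters of registered modules are returned by model.parameters() (and thus seen by the optimizer), so the parameter tensor count now grows with hidden_layer_count:

# MyModel registers 3 Linear layers -> 3 weights + 3 biases
print(len(list(my_model.parameters())))   # 6
# MyModel2 registers 9 Linear layers (input + 7 hidden + output)
print(len(list(my_model2.parameters())))  # 18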
B200011011