import torch
import torch.nn as nn

input = torch.randn(8, 3, 50, 100)

m = nn.Conv2d(3, 3, kernel_size=(3, 3), padding=(1, 1))
m2 = nn.Conv2d(3, 3, kernel_size=3, padding=1)

output = m(input)
output2 = m2(input)

torch.equal(output, output2)  # False

I would expect m and m2 above to produce exactly the same output, but in practice they do not. What is the reason?

deephao
2 Answers


You have indeed initialized two nn.Conv2d layers with identical settings. However, the weights themselves are initialized randomly! m and m2 are two different layers: m.weight and m2.weight hold different values, and the same goes for m.bias and m2.bias.

One way to get the same results is to copy the underlying parameters from one layer to the other:

>>> m.weight = m2.weight
>>> m.bias = m2.bias

This, of course, results in torch.equal(m(input), m2(input)) being True.
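
Alternatively, as a minimal sketch using PyTorch's state-dict API, you can copy all parameters in one step instead of assigning them one by one:

m.load_state_dict(m2.state_dict())  # copies both weight and bias in one call
torch.equal(m(input), m2(input))    # True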

Ivan

The "problem" here isn't related to int vs tuple. In fact, if you print m and m2 you'll see

>>> m
Conv2d(3, 3, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
>>> m2
Conv2d(3, 3, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))

that the integer got expanded as the documentation promises.

What actually differs is the initial weights, which are initialized randomly. You can view them via m.weight and m2.weight. These will differ every time you create a new Conv2d, even if you use the same arguments.
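
As a quick sanity check (a small sketch), you can reset the RNG seed before constructing each layer; with the same seed, the int and the tuple form produce identical parameters, confirming that the randomness is the only difference:

import torch
import torch.nn as nn

torch.manual_seed(0)
a = nn.Conv2d(3, 3, kernel_size=3, padding=1)

torch.manual_seed(0)
b = nn.Conv2d(3, 3, kernel_size=(3, 3), padding=(1, 1))

# Same seed, same configuration -> same random draws for the parameters.
torch.equal(a.weight, b.weight)  # True
torch.equal(a.bias, b.bias)      # True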

If you want to play around with these objects in a predictable way, you can initialize the weights yourself; see How to initialize weights in PyTorch?, e.g.

m.weight.data.fill_(0.01)  # set every weight to the same constant
m2.weight.data.fill_(0.01)
m.bias.data.fill_(0.1)     # set every bias to the same constant
m2.bias.data.fill_(0.1)

and the two layers should now produce identical outputs.
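
To verify, reusing input from the question:

torch.equal(m(input), m2(input))  # True: identical constant weights and biases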

Mikael Öhman