I'm new to Deep Learning. I'm studying from Udacity.
I came across some code for building a neural network in which two tensors are added together, specifically the bias tensor and the output of a matrix multiplication. It went something like this:
import torch

def activation(x):
    # Sigmoid activation
    return 1 / (1 + torch.exp(-x))

# images: a batch of 64 MNIST images from the course's data loader
inputs = images.view(images.shape[0], -1)  # flatten to [64, 784]
w1 = torch.randn(784, 256)                 # weights: [784, 256]
b1 = torch.randn(256)                      # bias: [256]
h = activation(torch.mm(inputs, w1) + b1)  # hidden layer output
After flattening the MNIST batch, inputs has shape [64, 784] (64 images, 784 pixels each).
I'm not getting how the bias tensor b1, of shape [256], can be added to the matrix product of inputs and w1, which comes out with shape [64, 256].
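To make the shapes concrete, here's a minimal, self-contained sketch (using random data as a stand-in for the actual MNIST batch) that prints the shapes involved; the addition runs without error even though the shapes don't match:

import torch

inputs = torch.randn(64, 784)   # stand-in for the flattened batch
w1 = torch.randn(784, 256)
b1 = torch.randn(256)

product = torch.mm(inputs, w1)
print(product.shape)            # torch.Size([64, 256])
print(b1.shape)                 # torch.Size([256])
print((product + b1).shape)     # torch.Size([64, 256])

So the [256] bias somehow gets added to every one of the 64 rows of the [64, 256] product. What is the rule that makes this work?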