If you're trying to apply a Conv2d to an RGB image, this PyTorch walkthrough should help.
Grab an image and make it a NumPy ndarray of uint8 (note that imshow expects uint8 values in the range 0-255, whilst floats should be in the range 0-1):
import requests
from io import BytesIO
import numpy as np
from PIL import Image
link = 'https://oldmooresalmanac.com/wp-content/uploads/2017/11/cow-2896329_960_720-Copy-476x459.jpg'
r = requests.get(link, timeout=7)
im = Image.open(BytesIO(r.content))
pic = np.array(im)
You can view it with:
import matplotlib.pyplot as plt
f, axarr = plt.subplots()
axarr.imshow(pic)
plt.show()
Create your convolution layer (it is initialized with random weights):
import torch
import torch.nn as nn
conv_layer = nn.Conv2d(in_channels=3, out_channels=3,
                       kernel_size=3, stride=1, bias=False)
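As a sanity check on the layer you just built, PyTorch stores Conv2d weights in the layout (out_channels, in_channels, kernel_h, kernel_w) — a minimal sketch with a fresh layer mirroring the one above:

```python
import torch.nn as nn

# A fresh Conv2d matching the layer above; weights start random
conv_layer = nn.Conv2d(in_channels=3, out_channels=3,
                       kernel_size=3, stride=1, bias=False)

# Weight layout is (out_channels, in_channels, kernel_h, kernel_w)
print(tuple(conv_layer.weight.shape))  # (3, 3, 3, 3)
```

With bias disabled, `conv_layer.bias` is simply `None`.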
Convert the input image to float and add a batch dimension, because PyTorch expects input of shape (N, C, H, W):
pic_float = np.float32(pic)
pic_float = np.expand_dims(pic_float,axis=0)
Run the image through the convolution layer (permute reorders the dimensions from the NHWC layout NumPy gives you to the NCHW layout PyTorch expects):
out = conv_layer(torch.tensor(pic_float).permute(0,3,1,2))
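To see exactly what that permute does, here is a small sketch with a dummy array (the 8×10 image size is made up for illustration):

```python
import numpy as np
import torch

# Dummy NHWC batch: (batch, height, width, channels)
pic_float = np.zeros((1, 8, 10, 3), dtype=np.float32)

# permute(0, 3, 1, 2) moves channels to axis 1, giving NCHW
t = torch.from_numpy(pic_float).permute(0, 3, 1, 2)
print(tuple(t.shape))  # (1, 3, 8, 10)
```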
Remove the extra batch dimension we added (not needed for visualization), detach from the computation graph, and convert to a NumPy ndarray:
out = out.permute(0,2,3,1).detach().numpy()[0, :, :, :]
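Note that the output is slightly smaller than the input: with kernel_size=3, stride=1 and no padding, each spatial dimension shrinks by 2, following out = (in - kernel_size) // stride + 1. A sketch with a dummy input (sizes are hypothetical):

```python
import torch
import torch.nn as nn

conv = nn.Conv2d(3, 3, kernel_size=3, stride=1, bias=False)
x = torch.zeros(1, 3, 8, 10)  # dummy NCHW input
y = conv(x)

# No padding: 8 -> (8 - 3) // 1 + 1 = 6, and 10 -> 8
print(tuple(y.shape))  # (1, 3, 6, 8)
```

Pass `padding=1` (or `padding='same'` in recent PyTorch versions) to the layer if you want the output to keep the input's spatial size.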
Visualise the output (clipped to 0-255 and cast back to uint8, which is what we started with; the raw convolution output can fall outside that range):
f, axarr = plt.subplots()
axarr.imshow(np.uint8(np.clip(out, 0, 255)))
plt.show()
You can then change the filter weights by accessing them directly. For example, to set every filter to a 3×3 smoothing kernel:
kernel = torch.tensor([[[[0.01, 0.02, 0.01],
                         [0.02, 0.04, 0.02],
                         [0.01, 0.02, 0.01]]]])
kernel = kernel.repeat(3, 3, 1, 1)  # tile to (out_channels, in_channels, 3, 3)
with torch.no_grad():
    conv_layer.weight.copy_(kernel)
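A quick self-contained check that the repeat produces the right weight shape and that the assignment sticks (a sketch, rebuilding the layer locally):

```python
import torch
import torch.nn as nn

conv_layer = nn.Conv2d(3, 3, kernel_size=3, stride=1, bias=False)

# 3x3 smoothing kernel, tiled across all out/in channel pairs
kernel = torch.tensor([[[[0.01, 0.02, 0.01],
                         [0.02, 0.04, 0.02],
                         [0.01, 0.02, 0.01]]]])
kernel = kernel.repeat(3, 3, 1, 1)  # -> (3, 3, 3, 3)

with torch.no_grad():
    conv_layer.weight.copy_(kernel)

same = torch.equal(conv_layer.weight, kernel)
print(same)  # True
```

Writing through `copy_()` inside `torch.no_grad()` is the currently recommended way to overwrite parameters; assigning to `.data` still works but is discouraged.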