TL;DR: How can I modify the code below to support padding = 'same'?
I was trying to build my own CNN using numpy and got confused by two conflicting explanations of padding = 'same'.
This answer says that:
"padding='Same' in Keras means padding is added as required to make up for overlaps when the input size and kernel size do not perfectly fit"
So according to this, 'same' means the minimum padding required in each direction. If that's the case, shouldn't it be applied equally on both sides? Or, if the minimum required padding were 2, shouldn't that be a valid amount to distribute equally across all 4 sides? And what if the required padding were just 3? What happens then?
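For reference, here is a minimal sketch of the split that TensorFlow documents for its "SAME" padding. The total is computed separately for each spatial dimension, and when it is odd the extra unit goes on the trailing side (bottom/right). The function name same_pad_amounts is hypothetical and not part of any library.

```python
import math


def same_pad_amounts(in_size: int, filter_size: int, stride: int = 1) -> tuple:
    """Return (pad_before, pad_after) for one spatial dimension under 'same' padding."""
    # 'same' targets an output size of ceil(in_size / stride)
    out_size = math.ceil(in_size / stride)
    # Total padding needed so the last window still fits; never negative
    pad_total = max((out_size - 1) * stride + filter_size - in_size, 0)
    # Split as evenly as possible; an odd remainder goes on the trailing side
    pad_before = pad_total // 2
    pad_after = pad_total - pad_before
    return pad_before, pad_after


print(same_pad_amounts(5, 3))            # (1, 1): total 2, split evenly
print(same_pad_amounts(5, 4))            # (1, 2): total 3, extra unit goes after
print(same_pad_amounts(6, 3, stride=2))  # (0, 1): total 1
```

So a required total of 3 is not rejected; it is simply split unevenly as 1 and 2.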
Also, what bothers me is the official TensorFlow documentation, which says:
"same" results in padding with zeros evenly to the left/right or up/down of the input such that output has the same height/width dimension as the input.
So what is the right answer?
Here is the code that I have written for padding:
import numpy as np
from typing import Union

def add_padding(X: np.ndarray, pad_size: Union[int, list, tuple], pad_val: int = 0) -> np.ndarray:
    '''
    Pad the input image array equally on all sides
    args:
        X: Input image batch in the form of [Batch, Height, Width, Channels]
        pad_size: How much padding should be done. If int, equal padding will be done on both spatial axes. Else specify how much to pad each axis (height_pad, width_pad) OR (y_pad, x_pad)
        pad_val: What should be the value to be padded. Usually it is 0 padding
    return:
        Padded NumPy array image
    '''
    assert X.ndim == 4, "Input image should be in the form of [Batch, Height, Width, Channels]"
    if isinstance(pad_size, int):
        y_pad = x_pad = pad_size
    else:
        y_pad = pad_size[0]
        x_pad = pad_size[1]
    # Do not pad the first (batch) and last (channel) axes.
    # Pad axis 1 with y_pad and axis 2 with x_pad, the same amount before and after.
    pad_width = ((0, 0), (y_pad, y_pad), (x_pad, x_pad), (0, 0))
    return np.pad(X, pad_width=pad_width, mode='constant', constant_values=(pad_val, pad_val))
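The function above always pads the same amount before and after each axis, so it cannot express an uneven split such as 1 row on one side and 2 on the other. A minimal sketch of a variant that takes a (before, after) pair per spatial axis is shown below. The name add_padding_asymmetric is hypothetical; y_pad applies to axis 1 and x_pad to axis 2, as in the code above.

```python
import numpy as np


def add_padding_asymmetric(X: np.ndarray, y_pad: tuple, x_pad: tuple, pad_val: int = 0) -> np.ndarray:
    """Pad axis 1 with y_pad=(before, after) and axis 2 with x_pad=(before, after)."""
    assert X.ndim == 4, "Expected a 4D array with batch first and channels last"
    # Batch (axis 0) and channel (axis 3) axes are left untouched
    pad_width = ((0, 0), tuple(y_pad), tuple(x_pad), (0, 0))
    return np.pad(X, pad_width=pad_width, mode='constant', constant_values=pad_val)


X = np.ones((2, 5, 5, 3))
padded = add_padding_asymmetric(X, y_pad=(1, 2), x_pad=(1, 2))
print(padded.shape)  # (2, 8, 8, 3)
```

Passing the same number twice in each pair reproduces the symmetric behaviour of add_padding.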
# Another part of my Layer
# New Height/Width is dependent on the old height/ width, stride, filter size, and amount of padding
h_new = int((h_old + (2 * padding_size) - filter_size) / self.stride) + 1
w_new = int((w_old + (2 * padding_size) - filter_size) / self.stride) + 1
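The formula above assumes the same padding_size on both sides of an axis. A minimal sketch of the same formula with the two sides kept separate follows, assuming non-negative integer sizes. The helper name conv_output_size is hypothetical.

```python
def conv_output_size(in_size: int, filter_size: int, stride: int,
                     pad_before: int, pad_after: int) -> int:
    """Output length along one spatial axis for a strided convolution."""
    # 2 * padding_size in the symmetric formula becomes pad_before + pad_after
    return (in_size + pad_before + pad_after - filter_size) // stride + 1


print(conv_output_size(5, 3, 1, 1, 1))  # 5: symmetric padding, stride 1
print(conv_output_size(5, 4, 1, 1, 2))  # 5: uneven padding of 1 and 2, stride 1
print(conv_output_size(6, 3, 2, 0, 1))  # 3: equals ceil(6 / 2) with stride 2
```

With pad_before equal to pad_after this reduces to the expression used for h_new and w_new.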