432

What is the difference between 'SAME' and 'VALID' padding in tf.nn.max_pool of tensorflow?

In my opinion, 'VALID' means there will be no zero padding outside the edges when we do max pool.

According to A guide to convolution arithmetic for deep learning, there will be no padding in the pool operator, i.e. just use 'VALID' in TensorFlow. But what is 'SAME' padding for max pool in TensorFlow?

Kayathiri
karl_TUM
  • 5
    Check https://www.tensorflow.org/api_guides/python/nn#Notes_on_SAME_Convolution_Padding for details, this is how tf done it. – GabrielChu Oct 09 '18 at 20:48
  • 9
    Here's a pretty [detailed answer with visualizations](https://www.corvil.com/kb/what-is-the-difference-between-same-and-valid-padding-in-tf-nn-max-pool-of-tensorflow). – rbinnun Nov 02 '18 at 14:44
  • 6
    Check out these amazing gifs to understand how padding and stride works. [Link](https://github.com/vdumoulin/conv_arithmetic/tree/master/gif) – Deepak Feb 07 '19 at 23:18
  • 4
    @GabrielChu your link appears to have died and is now a redirect to a general overview. – matt Dec 03 '19 at 07:03
  • 1
    As Tensorflow upgrading to 2.0, things will be replaced by Keras and I believe you can find the pooling information in Keras documentations. @matt – GabrielChu Dec 06 '19 at 03:17
  • How could someone vote this post past 420 – Ari Nov 18 '22 at 18:00

16 Answers

764

If you like ascii art:

  • "VALID" = without padding:

       inputs:         1  2  3  4  5  6  7  8  9  10 11 (12 13)
                      |________________|                dropped
                                     |_________________|
    
  • "SAME" = with zero padding:

                   pad|                                      |pad
       inputs:      0 |1  2  3  4  5  6  7  8  9  10 11 12 13|0  0
                   |________________|
                                  |_________________|
                                                 |________________|
    

In this example:

  • Input width = 13
  • Filter width = 6
  • Stride = 5

Notes:

  • "VALID" only ever drops the right-most columns (or bottom-most rows).
  • "SAME" tries to pad evenly left and right, but if the amount of columns to be added is odd, it will add the extra column to the right, as is the case in this example (the same logic applies vertically: there may be an extra row of zeros at the bottom).

Edit:

About the name:

  • With "SAME" padding, if you use a stride of 1, the layer's outputs will have the same spatial dimensions as its inputs.
  • With "VALID" padding, there's no "made-up" padding inputs. The layer only uses valid input data.
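To make the arithmetic above concrete, here is a small plain-Python sketch (not TensorFlow itself, just the formulas from its docs) computing the output length for both modes; the function name is mine, for illustration:

```python
import math

def out_len(n, k, s, padding):
    """Output length for one dimension of a pool/conv (per the TF doc formulas)."""
    if padding == "VALID":
        return math.ceil((n - k + 1) / s)  # windows must fit entirely inside
    return math.ceil(n / s)                # SAME: padding supplies the rest

print(out_len(13, 6, 5, "VALID"))  # 2 -- inputs 12 and 13 are dropped
print(out_len(13, 6, 5, "SAME"))   # 3 -- zero padding covers the overhang
```

With input width 13, filter 6 and stride 5 this reproduces the two VALID windows and three SAME windows drawn above.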
MiniQuark
  • 6
    Is it fair to say "SAME" means "use zero-padding to make sure the filter size doesn't have to change if the image width is not a multiple of the filter width or the image height is not a multiple of the filter height"? As in, "pad with zeros up to a multiple of the filter width" if width is the problem? – StatsSorceress Nov 30 '17 at 17:46
  • 6
    Answering my own side question: NO, that's not the point of zero padding. You choose the filter size to work with the input (including zero padding), but you don't choose the zero padding after the filter size. – StatsSorceress Dec 02 '17 at 01:20
  • 2
    I don't understand your own answer @StatsSorceress . It seems to me that you add enough zeros (in a as symmetric as possible way) so that all inputs are covered by some filter, am I right? – guillefix Jan 11 '19 at 17:25
  • 4
    Great answer, just to add: In case that the tensor values can be negative, padding for max_pooling is with `-inf`. – Tones29 Apr 10 '19 at 12:46
  • 1
    What if the input width is an even number when ksize=2, stride=2 and with SAME padding ?...then it should not be zero-padded right ?....I'm saying this when I look darkflow code repo, they are using SAME pad, stride=2,ksize=2 for maxpool....after maxpooling image width reduced down to 208 pixels from 416 pixel width. Can anyone clarify this ? – Kavindu Vindika Apr 04 '20 at 21:01
  • 1
    @MiniQuark Shouldn't the padding for "same" be 1 more both on the left and right side? As it is, with a stride of 1, I would expect the output to be of length 11. With two more added zeros, we would get 13, which equals the input length – Christopher Jun 30 '20 at 13:03
  • 1
    Ok, that was wrong - if the sequence is padded with 2 more zeros, there are 2 positions left after applying the kernel 3x. So I guess my question becomes how to decide how much padding is needed. Is it `if (stride==1): pad s.t. output length = input length; else: pad s.t. whole kernels can be applied`? So 5 for stride=1, 1 for stride=2, 2 for stride=3, ...? – Christopher Jun 30 '20 at 13:12
  • 1
    And to answer my question myself, I found this helpful website: https://www.pico.net/kb/what-is-the-difference-between-same-and-valid-padding-in-tf-nn-max-pool-of-tensorflow There, the formula to compute the necessary padding is given. – Christopher Jun 30 '20 at 14:00
  • 1
    I would argue that "same" padding for this case should be 2 to the left and 2 to the right, so the stride=1 output would be size 13. – buped82 Sep 06 '21 at 15:40
202

When stride is 1 (more typical with convolution than pooling), we can think of the following distinction:

  • "SAME": output size is the same as the input size. This requires the filter window to slip outside the input map, hence the need to pad.
  • "VALID": the filter window stays at valid positions inside the input map, so the output size shrinks by filter_size - 1. No padding occurs.
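For intuition, here's a tiny stride-1 max-pool sketch in plain Python (using -inf for the SAME padding so padded values never win a max):

```python
# stride 1, kernel size 3, input length 5
xs = [3, 1, 4, 1, 5]
k = 3

# VALID: windows stay inside the input
valid = [max(xs[i:i + k]) for i in range(len(xs) - k + 1)]
print(len(valid))  # 3, i.e. 5 - (k - 1): shrinks by filter_size - 1

# SAME: pad both ends so the output length matches the input
pad = [float("-inf")] * (k // 2)
padded = pad + xs + pad
same = [max(padded[i:i + k]) for i in range(len(xs))]
print(len(same))   # 5: same as the input
```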
YvesgereY
  • 85
    This is finally helpful. Up to this point, it appeared that `SAME` and `VALID` may as well have been called `foo` and `bar` – omatai Jan 25 '18 at 03:33
  • 14
    I think "output size is the **same** as input size" is true only when the stride length is 1. – omsrisagar Jul 16 '18 at 18:19
187

I'll give an example to make it clearer:

  • x: input image of shape [2, 3], 1 channel
  • valid_pad: max pool with 2x2 kernel, stride 2 and VALID padding.
  • same_pad: max pool with 2x2 kernel, stride 2 and SAME padding (this is the classic way to go)

The output shapes are:

  • valid_pad: here, no padding, so the output shape is [1, 1]
  • same_pad: here, we pad the image to the shape [2, 4] (with -inf, then apply max pool), so the output shape is [1, 2]

x = tf.constant([[1., 2., 3.],
                 [4., 5., 6.]])

x = tf.reshape(x, [1, 2, 3, 1])  # give a shape accepted by tf.nn.max_pool

valid_pad = tf.nn.max_pool(x, [1, 2, 2, 1], [1, 2, 2, 1], padding='VALID')
same_pad = tf.nn.max_pool(x, [1, 2, 2, 1], [1, 2, 2, 1], padding='SAME')

valid_pad.get_shape() == [1, 1, 1, 1]  # valid_pad is [5.]
same_pad.get_shape() == [1, 1, 2, 1]   # same_pad is  [5., 6.]
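The same results can be reproduced without TensorFlow; this NumPy sketch mimics a 2x2, stride-2 max pool, padding with -inf on the right for SAME as described above (the reshape trick works only because stride equals kernel size):

```python
import numpy as np

def max_pool_2x2(x, padding):
    """2x2 max pool with stride 2 on a 2-D array (a sketch, not tf itself)."""
    h, w = x.shape
    if padding == "SAME":
        # pad bottom/right up to a multiple of 2, with -inf so it never wins a max
        x = np.pad(x, ((0, (-h) % 2), (0, (-w) % 2)), constant_values=-np.inf)
    else:  # VALID: drop the bottom/right leftovers instead
        x = x[:h - h % 2, :w - w % 2]
    h, w = x.shape
    # group into non-overlapping 2x2 blocks and take the max of each
    return x.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

x = np.array([[1., 2., 3.], [4., 5., 6.]])
print(max_pool_2x2(x, "VALID"))  # [[5.]]
print(max_pool_2x2(x, "SAME"))   # [[5. 6.]]
```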

Olivier Moindrot
106

The TensorFlow Convolution example gives an overview of the difference between SAME and VALID:

  • For the SAME padding, the output height and width are computed as:

     out_height = ceil(float(in_height) / float(strides[1]))
     out_width  = ceil(float(in_width) / float(strides[2]))
    

And

  • For the VALID padding, the output height and width are computed as:

     out_height = ceil(float(in_height - filter_height + 1) / float(strides[1]))
     out_width  = ceil(float(in_width - filter_width + 1) / float(strides[2]))
    
n8yoder
RoyaumeIX
87

Complementing YvesgereY's great answer, I found this visualization extremely helpful:

Padding visualization

Padding 'valid' is the first figure. The filter window stays inside the image.

Padding 'same' is the third figure. With stride 1, the output is the same size as the input.


Found it in this article

Visualization credits: vdumoulin@GitHub

zmx
60

Padding is an operation that increases the size of the input data. For 1-dimensional data you just append/prepend the array with a constant; in 2-D you surround the matrix with these constants; in n dimensions you surround your n-dimensional hypercube with the constant. In most cases this constant is zero, and then it is called zero-padding.

Here is an example of zero-padding with p=1 applied to a 2-d tensor.


You can use arbitrary padding for your kernel, but some padding values are used more frequently than others:

  • VALID padding. The easiest case: no padding at all. Just leave your data as it was.
  • SAME padding, sometimes called HALF padding. It is called SAME because for a convolution with stride = 1 (or for pooling) it produces output of the same size as the input. It is called HALF because for a kernel of size k, the padding on each side is floor(k / 2), i.e. roughly half the kernel.
  • FULL padding is the maximum padding which does not result in a convolution over only padded elements. For a kernel of size k, this padding is equal to k - 1 on each side.

To use arbitrary padding in TF, you can use tf.pad()
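These three schemes predate TensorFlow; NumPy's 1-D convolution exposes them directly as modes, which makes the size arithmetic easy to check:

```python
import numpy as np

x = np.ones(10)   # input length n = 10
k = np.ones(3)    # kernel size k = 3

print(len(np.convolve(x, k, mode="valid")))  # 8  = n - (k - 1): VALID padding
print(len(np.convolve(x, k, mode="same")))   # 10 = n:            SAME/HALF padding
print(len(np.convolve(x, k, mode="full")))   # 12 = n + (k - 1):  FULL padding
```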

Salvador Dali
38

Quick Explanation

VALID: Don't apply any padding, i.e., assume that all dimensions are valid, so the input image gets fully covered by the filter and the stride you specified.

SAME: Apply padding to the input (if needed) so that the input image gets fully covered by the filter and the stride you specified. For stride 1, this will ensure that the output image size is the same as the input.

Notes

  • This applies to conv layers as well as max pool layers in the same way.
  • The term "valid" is a bit of a misnomer, because things don't become "invalid" if you drop part of the image. Sometimes you might even want that. It should probably have been called NO_PADDING instead.
  • The term "same" is a misnomer too, because it only makes sense for a stride of 1, when the output dimension is the same as the input dimension. For a stride of 2, the output dimensions will be half, for example. It should probably have been called AUTO_PADDING instead.
  • In SAME (i.e. auto-pad mode), TensorFlow will try to spread the padding evenly on both left and right.
  • In VALID (i.e. no-padding mode), TensorFlow will drop right and/or bottom cells if your filter and stride don't fully cover the input image.
Shital Shah
28

I am quoting this answer from the official TensorFlow docs https://www.tensorflow.org/api_guides/python/nn#Convolution. For the 'SAME' padding, the output height and width are computed as:

out_height = ceil(float(in_height) / float(strides[1]))
out_width  = ceil(float(in_width) / float(strides[2]))

and the padding on the top and left are computed as:

pad_along_height = max((out_height - 1) * strides[1] +
                    filter_height - in_height, 0)
pad_along_width = max((out_width - 1) * strides[2] +
                   filter_width - in_width, 0)
pad_top = pad_along_height // 2
pad_bottom = pad_along_height - pad_top
pad_left = pad_along_width // 2
pad_right = pad_along_width - pad_left

For the 'VALID' padding, the output height and width are computed as:

out_height = ceil(float(in_height - filter_height + 1) / float(strides[1]))
out_width  = ceil(float(in_width - filter_width + 1) / float(strides[2]))

and the padding values are always zero.
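Those quoted formulas translate directly into a few lines of Python (the helper name below is made up for illustration; the arithmetic is the quoted one). For the earlier ASCII-art example (input 13, filter 6, stride 5) it yields 1 zero on the left and 2 on the right:

```python
import math

def same_padding_1d(n, k, s):
    """Total SAME padding for one dimension, split TF-style (extra goes to the end)."""
    out = math.ceil(n / s)
    pad = max((out - 1) * s + k - n, 0)
    return pad // 2, pad - pad // 2  # (pad_before, pad_after)

print(same_padding_1d(13, 6, 5))  # (1, 2)
```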

Vaibhav Dixit
  • 3
    Frankly this is the only valid and complete answer around, not limited to strides of 1. And all it takes is a quote from the docs. +1 – P-Gn May 26 '18 at 18:53
  • 3
    Very useful to have this answer around, specially because the link you point to doesn't work anymore and it seems that Google erased that information from the tf website! – Daniel Mar 04 '20 at 02:28
  • 1
    This should be the answer to the question! indeed the only complete answer. – misha312 Oct 12 '21 at 23:01
13

There are three choices of padding: valid (no padding), same (or half), full. You can find explanations (in Theano) here: http://deeplearning.net/software/theano/tutorial/conv_arithmetic.html

  • Valid or no padding:

The valid padding involves no zero padding, so it covers only the valid input, not including artificially generated zeros. The length of the output is ((length of input) - (k - 1)) for a kernel of size k if the stride is s = 1.

  • Same or half padding:

The same padding makes the size of the output the same as that of the input when s = 1. If s = 1, the total number of zeros padded is (k - 1).

  • Full padding:

The full padding means that the kernel runs over the whole input, so at the ends the kernel may overlap only one real input value, with zeros elsewhere. The total number of zeros padded is 2(k - 1) if s = 1. The length of the output is ((length of input) + (k - 1)) if s = 1.

Therefore, the number of paddings: (valid) <= (same) <= (full)
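A quick sketch of those stride-1 output lengths in Python (the function name is just for illustration):

```python
def out_len(n, k, mode):
    # total zeros padded at stride s = 1, per the text above
    pad = {"valid": 0, "same": k - 1, "full": 2 * (k - 1)}[mode]
    return n + pad - (k - 1)

print(out_len(7, 5, "valid"))  # 3
print(out_len(7, 5, "same"))   # 7
print(out_len(7, 5, "full"))   # 11
```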

12

VALID padding: this means no padding at all. Hope there is no confusion.

x = tf.constant([[1., 2., 3.], [4., 5., 6.],[ 7., 8., 9.], [ 7., 8., 9.]])
x = tf.reshape(x, [1, 4, 3, 1])
valid_pad = tf.nn.max_pool(x, [1, 2, 2, 1], [1, 2, 2, 1], padding='VALID')
print (valid_pad.get_shape()) # output-->(1, 2, 1, 1)

SAME padding: This is kind of tricky to understand in the first place because we have to consider two conditions separately as mentioned in the official docs.

Let's take the input size as n, the output as o, the padding as p, the stride as s and the kernel size as k (only a single dimension is considered).

Case 01: n mod s = 0: p = max(k - s, 0)

Case 02: n mod s ≠ 0: p = max(k - (n mod s), 0)

p is calculated such that it is the minimum value padding can take. Since the values of n, k, s and p are known, the output size can be found using o = (n - k + p) / s + 1.

Let's work out this example:

x = tf.constant([[1., 2., 3.], [4., 5., 6.],[ 7., 8., 9.], [ 7., 8., 9.]])
x = tf.reshape(x, [1, 4, 3, 1])
same_pad = tf.nn.max_pool(x, [1, 2, 2, 1], [1, 2, 2, 1], padding='SAME')
print (same_pad.get_shape()) # --> output (1, 2, 2, 1)

Here the spatial dimensions of x are (4, 3), i.e. height 4 and width 3. Taking the horizontal direction (n = 3): n mod s = 1 ≠ 0, so p = max(2 - 1, 0) = 1 and o = (3 - 2 + 1) / 2 + 1 = 2.

Taking the vertical direction (n = 4): n mod s = 0, so p = max(2 - 2, 0) = 0 and o = (4 - 2 + 0) / 2 + 1 = 2.

Hope this will help to understand how actually SAME padding works in TF.

Cabbage soup
GPrathap
12

To sum up, 'valid' padding means no padding. The output size of the convolutional layer shrinks depending on the input size and kernel size.

In contrast, 'same' padding means using padding. When the stride is set to 1, the output size of the convolutional layer stays the same as the input size, by appending a certain number of '0-borders' around the input data when calculating the convolution.

Hope this intuitive description helps.

9

Based on the explanation here and following up on Tristan's answer, I usually use these quick functions for sanity checks.

import numpy as np

# a function to help us stay clean
def getPaddings(pad_along_height, pad_along_width):
    # integer halves; the extra pixel (if the total is odd) goes to bottom/right
    pad_top = pad_along_height // 2
    pad_bottom = pad_along_height - pad_top
    pad_left = pad_along_width // 2
    pad_right = pad_along_width - pad_left
    return pad_top, pad_bottom, pad_left, pad_right

# strides [image index, y, x, depth]
# padding 'SAME' or 'VALID'
# bottom and right sides always get the one additional padded pixel (if padding is odd)
def getOutputDim(inputWidth, inputHeight, filterWidth, filterHeight, strides, padding):
    if padding == 'SAME':
        out_height = int(np.ceil(float(inputHeight) / float(strides[1])))
        out_width = int(np.ceil(float(inputWidth) / float(strides[2])))

        pad_along_height = max((out_height - 1) * strides[1] + filterHeight - inputHeight, 0)
        pad_along_width = max((out_width - 1) * strides[2] + filterWidth - inputWidth, 0)

        # now get padding
        pad_top, pad_bottom, pad_left, pad_right = getPaddings(pad_along_height, pad_along_width)

        print('output height', out_height)
        print('output width', out_width)
        print('total pad along height', pad_along_height)
        print('total pad along width', pad_along_width)
        print('pad at top', pad_top)
        print('pad at bottom', pad_bottom)
        print('pad at left', pad_left)
        print('pad at right', pad_right)

    elif padding == 'VALID':
        out_height = int(np.ceil(float(inputHeight - filterHeight + 1) / float(strides[1])))
        out_width = int(np.ceil(float(inputWidth - filterWidth + 1) / float(strides[2])))

        print('output height', out_height)
        print('output width', out_width)
        print('no padding')


# use like so
getOutputDim(80, 80, 4, 4, [1, 1, 1, 1], 'SAME')
ahmedhosny
9

Padding on/off. Determines the effective size of your input.

VALID: No padding. Convolution etc. ops are only performed at locations that are "valid", i.e. not too close to the borders of your tensor.
With a kernel of 3x3 and image of 10x10, you would be performing convolution on the 8x8 area inside the borders.

SAME: Padding is provided. Whenever your operation references a neighborhood (no matter how big), zero values are provided when that neighborhood extends outside the original tensor to allow that operation to work also on border values.
With a kernel of 3x3 and image of 10x10, you would be performing convolution on the full 10x10 area.
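The 3x3-on-10x10 example can be checked with NumPy's sliding windows (a sketch, assuming max pooling as the neighborhood operation):

```python
import numpy as np

img = np.arange(100, dtype=float).reshape(10, 10)
k = 3

# VALID: only fully-interior 3x3 windows -> (10 - 3 + 1) = 8 per side
valid = np.lib.stride_tricks.sliding_window_view(img, (k, k)).max(axis=(2, 3))
print(valid.shape)  # (8, 8)

# SAME: pad a 1-pixel border so every pixel anchors a window
padded = np.pad(img, 1, constant_values=-np.inf)
same = np.lib.stride_tricks.sliding_window_view(padded, (k, k)).max(axis=(2, 3))
print(same.shape)   # (10, 10)
```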

Laine Mikael
9

General Formula

Here, W and H are the width and height of the input, F is the filter size, S is the stride, and P is the padding size (i.e., the number of rows or columns to be padded).

For SAME padding:

     out = ceil(W / S),  with total padding P = max((out - 1) * S + F - W, 0)

For VALID padding:

     out = ceil((W - F + 1) / S),  with P = 0

Shivam Kushwaha
2

Tensorflow 2.0 Compatible Answer: Detailed explanations of "Valid" and "Same" padding have been provided above.

However, I will list the different pooling functions and their respective commands in Tensorflow 2.x (>= 2.0), for the benefit of the community.

Functions in 1.x:

  • Max Pooling => tf.nn.max_pool, tf.keras.layers.MaxPool2D
  • Average Pooling => None in tf.nn; tf.keras.layers.AveragePooling2D

Functions in 2.x:

  • tf.nn.max_pool if used in 2.x, and tf.compat.v1.nn.max_pool_v2 or tf.compat.v2.nn.max_pool if migrated from 1.x to 2.x.
  • tf.keras.layers.MaxPool2D if used in 2.x, and tf.compat.v1.keras.layers.MaxPool2D, tf.compat.v1.keras.layers.MaxPooling2D, tf.compat.v2.keras.layers.MaxPool2D, or tf.compat.v2.keras.layers.MaxPooling2D if migrated from 1.x to 2.x.
  • Average Pooling => tf.nn.avg_pool2d or tf.keras.layers.AveragePooling2D if used in TF 2.x, and tf.compat.v1.nn.avg_pool_v2, tf.compat.v2.nn.avg_pool, tf.compat.v1.keras.layers.AveragePooling2D, tf.compat.v1.keras.layers.AvgPool2D, tf.compat.v2.keras.layers.AveragePooling2D, or tf.compat.v2.keras.layers.AvgPool2D if migrated from 1.x to 2.x.

For more information about Migration from Tensorflow 1.x to 2.x, please refer to this Migration Guide.

2

VALID padding is no padding. SAME padding is padding such that the output has the same size as the input (for stride 1).

shahab4ai
  • This does not provide an answer to the question. Once you have sufficient [reputation](https://stackoverflow.com/help/whats-reputation) you will be able to [comment on any post](https://stackoverflow.com/help/privileges/comment); instead, [provide answers that don't require clarification from the asker](https://meta.stackexchange.com/questions/214173/why-do-i-need-50-reputation-to-comment-what-can-i-do-instead). - [From Review](/review/late-answers/32138300) – Francisco Maria Calisto Jul 06 '22 at 13:23
  • 1
    @FranciscoMariaCalisto you are right I did not read the question until the end. Also thanks for the comment. – shahab4ai Jul 13 '22 at 10:39