CNN: input stride vs. output stride

Question

In the paper 'Fully Convolutional Networks for Semantic Segmentation', the authors distinguish between input stride and output stride in the context of deconvolution. How do these terms differ from each other?

Shamane Siriwardhana · Accepted Answer · 2017-07-27T06:49:48.640

Input stride is the stride of the filter, i.e. how far you shift the filter at each step.
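To make the filter stride concrete, here is a minimal sketch in plain Python. The helper name `filter_positions` and the sizes used are assumptions chosen for illustration, and padding is ignored. It lists the start index of every position a 1-D filter visits for a given stride.

```python
def filter_positions(input_length, kernel_size, stride):
    """Start index of every position a 1-D filter visits (no padding)."""
    last_start = input_length - kernel_size
    return list(range(0, last_start + 1, stride))


# Illustrative sizes: a length-7 input and a size-3 filter.
positions_stride_1 = filter_positions(input_length=7, kernel_size=3, stride=1)
positions_stride_2 = filter_positions(input_length=7, kernel_size=3, stride=2)

print(positions_stride_1)  # [0, 1, 2, 3, 4]
print(positions_stride_2)  # [0, 2, 4]
```

A larger stride means fewer filter positions, and therefore a smaller output.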

Output stride is actually a nominal value. In a CNN, we get the final feature map after several convolution and max-pooling operations. Let's say our input image is 224 × 224 and our final feature map is 7 × 7.

Then we say our output stride is 224 / 7 = 32, which approximates how much the image has been downsampled.
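The arithmetic above can be sketched in a few lines of Python. The function name `output_stride` and the list `layer_strides` are assumptions for illustration only. The list stands for a hypothetical network with five stride-2 layers and is not taken from any specific script.

```python
def output_stride(input_size, feature_map_size):
    """Ratio of input resolution to final feature-map resolution."""
    return input_size // feature_map_size


# Hypothetical network: five layers (convolutions or poolings), each with stride 2.
layer_strides = [2, 2, 2, 2, 2]

# The overall downsampling factor is the product of the per-layer strides.
cumulative_stride = 1
for s in layer_strides:
    cumulative_stride *= s

print(output_stride(224, 7))  # 32
print(cumulative_stride)      # 32
```

Both views give the same number: the output stride is the total factor by which the strided layers shrink the input.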

This TensorFlow script describes what the output stride is and how to use it in an FCN, which is the dense-prediction case:

one uses inputs with spatial dimensions that are multiples of 32 plus 1, e.g., [321, 321]. In this case the feature maps at the ResNet output will have spatial shape [(height - 1) / output_stride + 1, (width - 1) / output_stride + 1] and corners exactly aligned with the input image corners, which greatly facilitates alignment of the features to the image. Using as input [225, 225] images results in [8, 8] feature maps at the output of the last ResNet block.
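The shape formula in the quoted passage can be checked with a short Python sketch. The function name `aligned_feature_map_shape` is an assumption for illustration, not something from the script. Integer division is used here, which is exact when the input size is a multiple of the output stride plus 1.

```python
def aligned_feature_map_shape(height, width, output_stride=32):
    """Feature-map shape per the quoted formula: (size - 1) / output_stride + 1."""
    return [
        (height - 1) // output_stride + 1,
        (width - 1) // output_stride + 1,
    ]


print(aligned_feature_map_shape(321, 321))  # [11, 11]
print(aligned_feature_map_shape(225, 225))  # [8, 8]
```

The second call reproduces the [8, 8] feature map that the quote gives for [225, 225] inputs.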

hi @Shamane, thanks for this answer. Unfortunately, the script is not available anymore. could you re-provide it ? — desmond13, Nov 18 '21 at 10:28
