
I am a beginner, learning deep learning in baby steps, and I have a question about designing networks. In papers I see layers with different input/output sizes, and I do not know how to calculate or design them before implementation. For instance, in this paper there are numbers beside the layer outputs in the schematic (see the following figure). How are the filter sizes and other parameters specified for a network with a given input image size?

[figure]

Or, in another paper, they have the following design:

[figure]

They state: "For a 256x256 input image, the total sub-sampling factor of the network is 4, resulting in a 64x64xL array, where L is the number of class labels." How is this 64x64 size obtained?

How can I learn to design a network and calculate the inputs/outputs of its layers?

Thank you for any help

Shai
S.EB

1 Answer

  1. If you pool twice with stride=2, you halve the image size twice, for a total x4 reduction (sub-sampling) of the image size. Hence, if you start with an image of size 256: 256/4 = 64.

  2. How do you choose the kernel size, the number of outputs of each layer, the strides, and other design parameters? There is no single answer: many papers approach the same task with different settings. AFAIK there are no clear guidelines and no obvious choice of parameters that suits a specific task.
    That being said, you can find this work surveying some emerging deep nets design patterns.
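To make the arithmetic in point 1 concrete, here is a minimal sketch in plain Python. It uses the common output-size formula `floor((n + 2*padding - kernel) / stride) + 1`. The layer stack below is hypothetical (it is not the architecture from either paper), and it assumes floor rounding; some frameworks round up for pooling layers.

```python
def conv_output_size(n, kernel, stride=1, padding=0):
    """Spatial output size of a conv or pooling layer, assuming floor rounding."""
    return (n + 2 * padding - kernel) // stride + 1


def trace_sizes(input_size, layers):
    """Return the spatial size after the input and after every layer."""
    sizes = [input_size]
    for _name, kernel, stride, padding in layers:
        sizes.append(conv_output_size(sizes[-1], kernel, stride, padding))
    return sizes


# Hypothetical stack, for illustration only: (name, kernel, stride, padding).
# A 3x3 conv with padding 1 and stride 1 keeps the size; a 2x2 pool with stride 2 halves it.
layers = [
    ("conv3x3", 3, 1, 1),
    ("pool2x2", 2, 2, 0),
    ("conv3x3", 3, 1, 1),
    ("pool2x2", 2, 2, 0),
]

sizes = trace_sizes(256, layers)
print(sizes)  # [256, 256, 128, 128, 64]
```

Only the two stride-2 layers change the spatial size here, which gives the total sub-sampling factor of 4. The number of filters in each layer does not enter this calculation; it only sets the channel depth of the output.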

Shai
  • Thank you very much, Shai. May I ask how the filters are calculated in column #3 of the table? As you can see, they are 64, 128, 256, 512, 512, 1024, 39. Thanks – S.EB Feb 23 '17 at 17:25
  • @S.EB you might as well choose other numbers. – Shai Feb 23 '17 at 19:22
  • Dear Shai, thank you so much. I read that paper and am really happy to get such good information. Thanks for sharing. – S.EB Feb 23 '17 at 19:45