What is the difference between defining layers in the __init__() function and calling them later in forward(), versus using layers directly in the forward() function?
Should I define every layer of my compute graph in the constructor (i.e. __init__) before I write the compute graph?
Or can I define and use them directly in forward()?

AlphaGoMK

1 Answer
Anything that contains weights you want to be trained during the training process should be defined in your __init__ method.

You don't need to define activation functions like softmax, ReLU or sigmoid in your __init__; you can just call them in forward.

Dropout layers, for example, also don't need to be defined in __init__; they can just be called in forward too. However, defining them in your __init__ has the advantage that they are switched off automatically during evaluation (by calling eval() on your model).
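To illustrate, here is a minimal sketch of both styles (the class names `NetModular` and `NetFunctional` and the layer sizes are made up for this example):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Version 1: dropout defined in __init__ as a registered submodule.
class NetModular(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(10, 20)   # has trainable weights -> belongs in __init__
        self.drop = nn.Dropout(p=0.5)  # registered submodule, toggled by train()/eval()
        self.fc2 = nn.Linear(20, 2)

    def forward(self, x):
        x = F.relu(self.fc1(x))        # activation has no weights, fine to call here
        x = self.drop(x)               # disabled automatically after model.eval()
        return self.fc2(x)

# Version 2: dropout called functionally inside forward.
class NetFunctional(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(10, 20)
        self.fc2 = nn.Linear(20, 2)

    def forward(self, x):
        x = F.relu(self.fc1(x))
        # F.dropout is not a registered submodule, so you must pass the
        # train/eval flag yourself via self.training:
        x = F.dropout(x, p=0.5, training=self.training)
        return self.fc2(x)
```

Both networks compute the same thing; the difference is only in who is responsible for turning dropout off at evaluation time.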

Hope this is clear. Just ask if you have any further questions.

MBT
  • That means if a layer has trainable parameters, I should put it into the `__init__` function? I noticed that softmax, ReLU and sigmoid don't have any trainable parameters, which is why I asked. – AlphaGoMK May 29 '18 at 08:20
  • Yes, you need to initialize everything with trainable parameters in your `__init__` - this is not the case for activations like softmax, ReLU or sigmoid. – MBT May 30 '18 at 08:04
  • Thanks, it really helps a lot. – AlphaGoMK May 31 '18 at 09:31
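The "does it have trainable parameters?" test from the comments can be checked directly, since every `nn.Module` exposes its parameters via `.parameters()` (a quick sketch; the layer sizes are arbitrary):

```python
import torch.nn as nn

linear = nn.Linear(3, 4)  # holds a weight matrix and a bias vector
relu = nn.ReLU()          # pure function, holds nothing trainable

# A Linear layer registers two parameter tensors; ReLU registers none.
print(len(list(linear.parameters())))  # 2
print(len(list(relu.parameters())))    # 0
```

Anything that reports zero parameters is safe to call functionally in forward.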