
He / MSRA initialization, from Delving Deep into Rectifiers, seems to be a recommended weight initialization when using ReLUs.

Is there a built-in way to use this in TensorFlow (similar to: How to do Xavier initialization on TensorFlow)?


1 Answer


TensorFlow 2.0

tf.keras.initializers.HeUniform()

or

tf.keras.initializers.HeNormal()

See docs for usage. (h/t to @mable)
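
A minimal sketch of using one of these with the Keras API (the layer size and shapes here are illustrative, not from the docs):

initializer = tf.keras.initializers.HeNormal()

# Pass it to a layer ...
layer = tf.keras.layers.Dense(256, activation='relu',
            kernel_initializer=initializer)

# ... or draw a weight tensor from it directly.
W1 = tf.Variable(initializer(shape=(784, 256)))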

TensorFlow 1.0

tf.contrib.layers.variance_scaling_initializer(dtype=tf.float32)

This will give you He / MSRA initialization. The documentation states that the default arguments of tf.contrib.layers.variance_scaling_initializer correspond to He initialization, and that changing the arguments yields Xavier initialization (this is what TF's internal implementation of Xavier initialization does).
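
A rough sketch of that argument mapping (mirroring what the TF 1.x Xavier initializer does internally; treat the exact argument values as something to verify against the docs):

# He / MSRA: the defaults (factor=2.0, mode='FAN_IN', truncated normal).
he_init = tf.contrib.layers.variance_scaling_initializer()

# Approximately Xavier: average the fan-in and fan-out, sample uniformly.
xavier_like_init = tf.contrib.layers.variance_scaling_initializer(
    factor=1.0, mode='FAN_AVG', uniform=True)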

Example usage:

W1 = tf.get_variable('W1', shape=[784, 256],
       initializer=tf.contrib.layers.variance_scaling_initializer())

or

initializer = tf.contrib.layers.variance_scaling_initializer()
W1 = tf.Variable(initializer([784,256]))
    For all who stumble across this: By now, the activation is available in `tf.keras.initializers.HeNormal` (or `tf.keras.initializers.VarianceScaling` using default parameters) – mable Dec 02 '20 at 11:52
  • Could you please explain what is the difference between the `tf.keras.initializers.HeUniform()` and `tf.keras.initializers.HeNormal()`? – Hong Cheng Jul 01 '21 at 01:00
  • HeUniform draws the weights from a uniform distribution U(-x, x). HeNormal draws them from a normal distribution N(0, x), or something like that, where x is a small value determined by He or Xavier or whatnot. The documentation should explain it. – matwilso Jul 01 '21 at 03:38
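
For concreteness, a small sketch of the scales behind the two variants as the Keras docs describe them (fan_in below is an illustrative value):

import math

fan_in = 784  # number of input units feeding the layer (illustrative)

# HeNormal: truncated normal with mean 0 and stddev = sqrt(2 / fan_in)
he_normal_stddev = math.sqrt(2.0 / fan_in)

# HeUniform: uniform in [-limit, limit] with limit = sqrt(6 / fan_in)
he_uniform_limit = math.sqrt(6.0 / fan_in)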