How to initialize weight for convolution layers in Tensorflow Object Detection API?

Question

I followed this tutorial to set up the Tensorflow Object Detection API.

The preferred way is to start from a pretrained model.

But in some cases we need to train from scratch.

For that, we just need to comment out two lines in the configuration file:

#fine_tune_checkpoint: "object_detection/data/mobilenet_v1_1.0_224/mobilenet_v1_1.0_224.ckpt"
#from_detection_checkpoint: true

If I want to initialize the weights with Xavier initialization, how can I do that?

score 3 · Accepted Answer · answered Mar 15 '19 at 09:22

As you can see in the configuration protobuf definition, there are 3 initializers you can use:

TruncatedNormalInitializer truncated_normal_initializer
VarianceScalingInitializer variance_scaling_initializer
RandomNormalInitializer random_normal_initializer

The VarianceScalingInitializer is what you are looking for. It is a general initializer that you can turn into the Xavier initializer by setting factor=1.0 and mode='FAN_AVG' (with uniform=true for the uniform variant), as stated in the documentation.

So, by setting the initializer as

initializer {
    variance_scaling_initializer {
        factor: 1.0
        uniform: true
        mode: FAN_AVG
    }
}

in your configuration, you obtain the Xavier initializer.
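To see why these settings correspond to Xavier, here is a minimal sketch in plain Python of the arithmetic involved. It assumes that a variance scaling initializer with uniform sampling draws from [-limit, limit] with limit = sqrt(3 * factor / n), where n is the average of fan-in and fan-out in FAN_AVG mode. The helper names and the layer shape are made up for illustration and are not part of the Object Detection API.

```python
import math
import random


def variance_scaling_uniform_limit(fan_in, fan_out, factor=1.0, mode="FAN_AVG"):
    """Bound of the uniform range [-limit, limit] (hypothetical helper)."""
    if mode == "FAN_IN":
        n = fan_in
    elif mode == "FAN_OUT":
        n = fan_out
    elif mode == "FAN_AVG":
        n = (fan_in + fan_out) / 2.0
    else:
        raise ValueError("unknown mode: %s" % mode)
    # Assumed formula for the uniform case: limit = sqrt(3 * factor / n).
    return math.sqrt(3.0 * factor / n)


def xavier_uniform_limit(fan_in, fan_out):
    """Xavier (Glorot) uniform bound: sqrt(6 / (fan_in + fan_out))."""
    return math.sqrt(6.0 / (fan_in + fan_out))


# Illustrative layer: 3x3 convolution, 32 input channels, 64 output channels.
kernel_h, kernel_w, in_ch, out_ch = 3, 3, 32, 64
fan_in = kernel_h * kernel_w * in_ch    # inputs feeding one output unit
fan_out = kernel_h * kernel_w * out_ch  # outputs fed by one input unit

limit = variance_scaling_uniform_limit(fan_in, fan_out)

# Draw one weight per kernel entry from the uniform range.
rng = random.Random(0)
weights = [rng.uniform(-limit, limit) for _ in range(fan_in * out_ch)]

print(limit, xavier_uniform_limit(fan_in, fan_out))
```

Under that assumption, both functions return the same bound for any layer shape, which is why factor 1.0 with FAN_AVG and uniform sampling behaves like Xavier uniform.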

Also, even if you need to train on new data, consider initializing from a pretrained network instead of using random initialization. For more details, see this article.

score 0 · Answer 2 · answered Mar 15 '19 at 09:18

The mobilenet_v1 feature extractor imports the backbone network from research/slim/nets:

from nets import mobilenet_v1

The code of mobilenet instantiates the layers according to the specification like this:

net = slim.conv2d(net, depth(conv_def.depth), conv_def.kernel, stride=conv_def.stride, scope=end_point)

See https://github.com/tensorflow/models/blob/master/research/slim/nets/mobilenet_v1.py#L264

As you can see, there are no kwargs passed to the conv2d call, so with the current code you cannot specify which weights_initializer will be used.

However, the default weights_initializer of slim.conv2d is Xavier anyway, so you are lucky.

I must say that training an object detection model without pre-training the feature extractor on some auxiliary task may simply fail.

I'm afraid he wants to use the Object Detection API, and not directly the mobilenet, and there the training parameters are not set using the source code. — Matěj Račinský, Mar 15 '19 at 09:23
Yes, my training from scratch failed. But doing the same thing in Caffe, with Xavier initialization, gave good results. — batuman, Mar 15 '19 at 09:29
