This is my train.prototxt. And this is my deploy.prototxt.

When I try to load my deploy file, I get this error:

File "./python/caffe/classifier.py", line 29, in __init__  
in_ = self.inputs[0]  
IndexError: list index out of range  

So I removed the data layer, and then got this error:

F1117 23:16:09.485153 21910 insert_splits.cpp:35] Unknown bottom blob 'data' (layer 'conv1', bottom index 0)
*** Check failure stack trace: ***

Then I removed bottom: "data" from the conv1 layer.

After that, I got this error:

F1117 23:17:15.363919 21935 insert_splits.cpp:35] Unknown bottom blob 'label' (layer 'loss', bottom index 1)
*** Check failure stack trace: ***

I removed bottom: "label" from the loss layer, and got this error:

I1117 23:19:11.171021 21962 layer_factory.hpp:76] Creating layer conv1
I1117 23:19:11.171036 21962 net.cpp:110] Creating Layer conv1
I1117 23:19:11.171041 21962 net.cpp:433] conv1 -> conv1
F1117 23:19:11.171061 21962 layer.hpp:379] Check failed: MinBottomBlobs() <= bottom.size() (1 vs. 0) Convolution Layer takes at least 1 bottom blob(s) as input.
*** Check failure stack trace: ***

What should I do to fix it and create my deploy file?

Shai
Carlos Porta

2 Answers

There are two main differences between a "train" prototxt and a "deploy" one:

1. Inputs: During training, the data is fixed to a pre-processed training dataset (LMDB/HDF5 etc.), whereas deploying the net requires it to process other inputs in a more "random" fashion.
Therefore, the first change is to remove the input layers (the layers that push "data" and "labels" during the TRAIN and TEST phases). To replace the input layers, you need to add the following declaration:

input: "data"
input_shape: { dim:1 dim:3 dim:224 dim:224 }

This declaration does not provide the actual data for the net, but it tells the net what shape to expect, allowing Caffe to pre-allocate the necessary resources.

2. Loss: The topmost layers in a training prototxt define the loss function for training. This usually involves the ground truth labels. When deploying the net, you no longer have access to these labels, so loss layers should be converted to "prediction" outputs. For example, a "SoftmaxWithLoss" layer should be converted to a simple "Softmax" layer that outputs class probabilities instead of a log-likelihood loss. Some other loss layers already take the predictions as inputs, so it is sufficient to just remove them.
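The two changes above can be sketched as a small text-level conversion in plain Python. This is only an illustration under stated assumptions, not a Caffe API: the helper names (`split_layers`, `to_deploy`), the sets of layer types, the blob names "data" and "label", and the 1x3x224x224 shape are all hypothetical choices for the example. It also assumes each layer's own `type:` field appears before any nested `type:` (such as a filler) and that no braces occur inside quoted strings. Review the generated file by hand before using it.

```python
import re

# Assumed sets of layer types for this sketch; extend them for your own net.
INPUT_TYPES = {"Data", "HDF5Data", "ImageData", "MemoryData"}
TRAIN_ONLY_TYPES = {"Accuracy"}
LOSS_TO_PREDICTION = {"SoftmaxWithLoss": "Softmax"}


def split_layers(text):
    """Split prototxt text into (header, [layer blocks]) by brace matching."""
    layers, first, pos = [], None, 0
    pattern = re.compile(r"\blayer\s*\{")
    while True:
        m = pattern.search(text, pos)
        if m is None:
            break
        if first is None:
            first = m.start()
        depth, i = 1, m.end()
        while depth > 0 and i < len(text):
            if text[i] == "{":
                depth += 1
            elif text[i] == "}":
                depth -= 1
            i += 1
        layers.append(text[m.start():i])
        pos = i
    header = text if first is None else text[:first]
    return header, layers


def to_deploy(train_text, input_name="data", label_name="label",
              shape=(1, 3, 224, 224)):
    """Return deploy-style prototxt text derived from train-style text."""
    header, layers = split_layers(train_text)
    dims = " ".join("dim:%d" % d for d in shape)
    out = [header.strip(),
           'input: "%s"' % input_name,
           "input_shape: { %s }" % dims]
    label_line = re.compile(r'[ \t]*bottom:\s*"%s"[ \t]*\n?' % re.escape(label_name))
    for block in layers:
        match = re.search(r'\btype:\s*"(\w+)"', block)
        layer_type = match.group(1) if match else ""
        if layer_type in INPUT_TYPES or layer_type in TRAIN_ONLY_TYPES:
            continue  # change 1: drop input layers (and training-only layers)
        if layer_type in LOSS_TO_PREDICTION:
            # change 2: turn the loss into a prediction and detach the label
            block = block.replace('"%s"' % layer_type,
                                  '"%s"' % LOSS_TO_PREDICTION[layer_type])
            block = label_line.sub("", block)
        out.append(block)
    return "\n".join(part for part in out if part) + "\n"


TRAIN_EXAMPLE = '''name: "TinyNet"
layer {
  name: "data"
  type: "Data"
  top: "data"
  top: "label"
  data_param { source: "train_lmdb" batch_size: 64 backend: LMDB }
}
layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  convolution_param { num_output: 10 kernel_size: 5 }
}
layer {
  name: "loss"
  type: "SoftmaxWithLoss"
  bottom: "conv1"
  bottom: "label"
  top: "loss"
}
'''

deploy_text = to_deploy(TRAIN_EXAMPLE)
print(deploy_text)
```

Note that conv1 keeps its bottom: "data" line. The blob is now supplied by the input declaration rather than by a data layer, which is why deleting that line by hand leads to the "takes at least 1 bottom blob(s)" failure.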

Update: see this tutorial for more information.

Shai
  • @0x1337 in order to define the `shape` of the input `"data"` we use the [`'BlobShape'`](https://github.com/BVLC/caffe/blob/master/src/caffe/proto/caffe.proto#L6) proto message. This `shape` has a "repeated" parameter `dim`, each entry of which defines one dimension of the `shape`. `dim:1` means we expect `'data'`, at deploy stage, to include only one sample at a time (i.e., `batch_size:1`). – Shai Feb 22 '16 at 14:29
  • @Shai Thanks for the clear explanation. Is there a similar way to programmatically generate the deploy.prototxt in Python? More details about my question are here -- http://stackoverflow.com/questions/40986009/how-to-programmatically-generate-deploy-txt-for-caffe-in-python – cdeepakroy Dec 07 '16 at 20:42
  • @Shai, what difference does a larger `batch_size` make during testing? Does it have any effect on test accuracy, or does it only matter for the training process? – Farid Alijani Apr 16 '20 at 13:34
  • @FäridAlijani it has no effect on test time accuracy – Shai Apr 16 '20 at 13:54
  • @Shai, so I have `net.blobs[net.inputs[0]].reshape(batch_sz, ch, height, width) # n,C,H,W -> 1,C,H,W` do I always assume `batch_size = 1` is the easiest w.r.t memory consumption and time efficiency? – Farid Alijani Apr 16 '20 at 15:05

Besides the advice from @Shai, you may also want to disable the dropout layers.

Although Jia Yangqing, the author of Caffe, once said that dropout layers have a negligible impact on the testing results (google group conversation, 2014), other deep learning tools suggest disabling dropout in the deploy phase (for example, Lasagne).

True
  • It depends on the way your dropout layer is implemented. In some cases you need to replace the dropout with a scaling to compensate for the increased energy of the signal that is no longer dropped. For instance, if you drop 50% during training, then at test time twice as much signal passes through the layer, so you need to scale the output down by 50%. – Shai Jul 28 '16 at 10:49