
I am trying to train a fully convolutional network for my problem. I am using the implementation at https://github.com/shelhamer/fcn.berkeleyvision.org . My images have different sizes.

  1. I am not sure how to set the 'Offset' param in the 'Crop' layer.
  2. What are the default values for the 'Offset' param?
  3. How do I use this param to crop the images around the center?
user570593

1 Answer


According to the Crop layer documentation, it takes two bottom blobs and outputs one top blob. Let's call the bottom blobs A and B, and the top blob T.

A -> 32 x 3 x 224 x 224
B -> 32 x m x n x p

Then,

T -> 32 x m x n x p (assuming axis = 1; see below)

Regarding the axis parameter, from the docs:

Takes a Blob and crop it, to the shape specified by the second input Blob, across all dimensions after the specified axis.

This means that if we set axis = 1, then it will crop dimensions 1, 2 and 3. If axis = 2, then T would be of size 32 x 3 x n x p. You can also set axis to a negative value, such as -1, which means the last dimension, i.e. dimension 3 in this case.
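
To make this concrete, here is a minimal NumPy sketch (just my illustration of the shape arithmetic, not Caffe code); the shapes of A and B are example values:

import numpy as np

# Illustration only: which dimensions the Crop layer touches for a given axis.
A = np.zeros((32, 3, 224, 224))   # the blob being cropped
B = np.zeros((32, 3, 100, 120))   # the reference blob; its trailing dims set the target

axis = 2  # crop every dimension from `axis` onwards; earlier dims are kept from A

# Output shape: A's dims before `axis`, followed by B's dims from `axis` on.
target_shape = A.shape[:axis] + B.shape[axis:]
print(target_shape)  # (32, 3, 100, 120)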

Regarding the offset parameter, I checked $CAFFE_ROOT/src/caffe/proto/caffe.proto (on line 630) and did not find any default value for offset, so I assume that you have to provide that parameter, otherwise it will result in an error. However, I may be wrong.

Now, Caffe knows that you need a blob of size m on the first cropped axis (axis 1). We still need to tell Caffe where to start cropping. That's where offset comes in. If offset is 10, then your blob of size m will be cropped starting at index 10 and ending at index 10 + m - 1 (for a total size of m). Set a single value for offset to crop by that amount in all the cropped dimensions (which are determined by axis, remember? In this case 1, 2, 3). Otherwise, if you want to crop each dimension differently, you have to specify as many offsets as there are dimensions being cropped (in this case 3).
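
As a sketch of the offset slicing (again, plain NumPy as my own illustration, not Caffe code; the sizes and offsets are made-up numbers):

import numpy as np

A = np.zeros((32, 3, 224, 224))
axis = 2
crop_sizes = (100, 120)   # sizes taken from the reference blob B
offsets = (10, 20)        # one offset per cropped dimension

# A single offset would be reused for every cropped dimension.
if len(offsets) == 1:
    offsets = offsets * len(crop_sizes)

# Each cropped dimension is sliced from offset to offset + size.
slices = [slice(None)] * axis + [slice(o, o + s) for o, s in zip(offsets, crop_sizes)]
T = A[tuple(slices)]
print(T.shape)  # (32, 3, 100, 120)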

To sum it all up: if you have a blob of size 32 x 3 x 224 x 224 and you want to crop a center part of size 32 x 3 x 32 x 64, you would write the crop layer as follows:

layer {
  name: "T"
  type: "Crop"
  bottom: "A"
  bottom: "B"
  top: "T"
  crop_param {
    axis: 2
    offset: 96
    offset: 80
  }
}
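
If you want to double-check the center-crop arithmetic, here is a quick NumPy sketch (illustration only, not Caffe code):

import numpy as np

A = np.zeros((32, 3, 224, 224))
offset_h = (224 - 32) // 2   # 96
offset_w = (224 - 64) // 2   # 80

T = A[:, :, offset_h:offset_h + 32, offset_w:offset_w + 64]
print(offset_h, offset_w, T.shape)  # 96 80 (32, 3, 32, 64)

Note that in the layer definition above it is the reference blob B (the second bottom) that actually provides the target sizes 32 and 64 on dimensions 2 and 3; the offsets only say where the crop starts.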
Autonomous
  • Thanks for the answer. It seems the cropping is implemented in https://github.com/BVLC/caffe/blob/master/python/caffe/coord_map.py. I didn't understand everything as I am not very familiar with Python, but it seems the default parameters are there. In my understanding, a fully convolutional CNN takes a fixed-size input. If the input sizes are different, there will be a problem with handling the crop parameters. Am I correct? – user570593 Jul 26 '16 at 14:18
  • @user570593 I am not sure how FCNN works. However, the crop layer takes two bottom blobs. If one of them is of varying size and the other one (the reference blob, `B` in the above example) is of fixed size, then the varying-size blob will always be cropped to the same size as `B`. This should not pose a problem then. Of course, you lose some information when you crop, so FCNN must have some way of handling that. I think when they write `params.get('offset', 0)` (on [line 46](github.com/BVLC/caffe/blob/master/python/caffe/coord_map.py)), they assign a default value of 0 to offset. – Autonomous Jul 26 '16 at 15:01
  • I believe the default value is indeed `0`, i.e. cropping starts at the very beginning (the corner). But the correct reference might be [here](https://github.com/BVLC/caffe/blob/master/src/caffe/layers/crop_layer.cpp#L50), which initializes the value to 0. – wlnirvana Aug 16 '17 at 05:47