
I have already trained the FCN model with fixed-size images (256x256). Could I ask the experts how I can train the same model when the image size changes from one image to the next?

I really appreciate your advice. Thanks

Shai
S.EB

1 Answer

You can choose one of these strategies:

1. Batch = 1 image

By treating each image as its own batch, you can reshape the net in the data layer's forward() (rather than in reshape()), thus changing the net's input shape at each iteration.
+Write the reshape once in the forward method and you no longer need to worry about input shapes and sizes.

-Reshaping the net usually requires allocation/deallocation of CPU/GPU memory and therefore takes time.
-A single image per batch may be too small a batch.

For example (assuming you are using a "Python" layer for input):

def reshape(self, bottom, top):
  pass  # you do not reshape here.

def forward(self, bottom, top):
  top[0].reshape( ... )  # reshape the blob - this propagates the new shape to the rest of the net at each iteration
  top[1].reshape( ... )  #

  # feed the data to the net      
  top[0].data[...] = current_img
  top[1].data[...] = current_label

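For completeness, such a `"Python"` layer **replaces** the `"Data"` layer at the top of the train prototxt. A sketch of the declaration (the module and class names here are hypothetical, not from the question):

```
layer {
  name: "input"
  type: "Python"
  top: "data"
  top: "label"
  python_param {
    module: "my_input_layer"  # hypothetical Python module on PYTHONPATH
    layer: "MyInputLayer"     # hypothetical class implementing reshape()/forward()
  }
}
```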
2. Random crops

You can decide on a fixed input size and then randomly crop all input images (and the corresponding ground truths).
+No need to reshape every iteration (faster).
+Control over model size during training.

-Need to implement random crops for images and labels
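Option 2 only needs an aligned crop of the image and its ground truth. A minimal NumPy sketch (the function name and signature are ours, not from the question):

```python
import numpy as np

def random_crop(img, label, crop_h, crop_w, rng=None):
    """Crop the same random window from an image (H, W, C) and its
    label map (H, W), keeping them pixel-aligned."""
    if rng is None:
        rng = np.random.default_rng()
    h, w = img.shape[:2]
    assert h >= crop_h and w >= crop_w, "image smaller than crop size"
    # pick the top-left corner uniformly at random
    y = int(rng.integers(0, h - crop_h + 1))
    x = int(rng.integers(0, w - crop_w + 1))
    return (img[y:y + crop_h, x:x + crop_w],
            label[y:y + crop_h, x:x + crop_w])
```

Using the same (y, x) corner for both arrays is the whole trick: the label crop stays registered with the image crop.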

3. Fixed size

Resize all images to the same size (like in SSD).
+Simple

-Images are distorted if they do not all have the same aspect ratio.
-You are not invariant to scale.
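For option 3, any resize works for the images, but the label maps should use nearest-neighbor interpolation so class IDs are not blended. A small NumPy sketch (the function name is ours; in practice you would likely call cv2.resize or PIL):

```python
import numpy as np

def resize_nearest(img, out_h, out_w):
    """Nearest-neighbor resize via plain NumPy indexing.
    Works for (H, W) label maps and (H, W, C) images alike."""
    h, w = img.shape[:2]
    # map each output row/column back to a source row/column
    rows = np.arange(out_h) * h // out_h
    cols = np.arange(out_w) * w // out_w
    return img[rows[:, None], cols]
```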

Shai
  • Thank a lot for your help, should I add reshape layer after `Data` layer and before `Convolution` layer for both `data` and `label`? Is the following layer correct if `height` and `width` are changing?` layer { name: "my_reshape" type: "Reshape" bottom: "data" top: "reshape_in" reshape_param { shape: {dim: 0 dim: 0 dim: -1 dim: -1 } } }` – S.EB Jan 24 '18 at 09:54
  • Could I ask how can I change `deploy.prototxt`? – S.EB Jan 24 '18 at 10:05
  • @S.EB your data layer should do that for you. in the code you linked there is a python layer that handles inputs – Shai Jan 24 '18 at 10:08
  • Sir, could you please explain the first option more? I thought you are mentioning [Reshape Layer](http://caffe.berkeleyvision.org/tutorial/layers/reshape.html). Thanks – S.EB Jan 24 '18 at 10:42
  • Thank you so much for your help. Could I ask one more question? Where should I place the `Python` Layer in the model definition, after `Data` layers? apologies for asking lots of questions. Thanks once again – S.EB Jan 24 '18 at 11:03
  • I assumed the `"Python"` layer **replaces** the `"Data"` layer. Coming to think of it. If you build an `lmdb` with images/labels I suppose you can have a different size for each image and then the `"Data"` layer handles everything for you (I haven't tried it myself). – Shai Jan 24 '18 at 11:15
  • Thanks a lot, I will try and post it here. – S.EB Jan 24 '18 at 12:11
  • @Shai: Crazy enough, in the past year I somehow used all of these methods! – Alex Jun 04 '18 at 22:40
  • 1
    @Shai: I mostly used (3), but now I'm developing a new model that samples data from a single image only (like Mask R-CNN), so switched to (1) – Alex Jun 05 '18 at 09:58
  • @Shai: it seems that even when minibatch=1, validation images have to be the same size as the training images: otherwise fully connected layers report an error: Input size incompatible with inner product parameters. – Alex Jun 05 '18 at 22:51
  • @Shai: good point. Unfortunately I didn't find a better way to make a prediction – Alex Jun 06 '18 at 07:50
  • @Alex look at net_surgery tutorial: they convert innerproduct layer to conv layer. you can predict from conv outputs as well – Shai Jun 06 '18 at 08:18