
I am training a CNN in Caffe whose output is one of two classes (a binary problem).

I am using an ImageData layer as the input layer, passing two .txt files with the training and validation sets of images. These files are balanced, i.e., the number of examples is the same for both classes. In this layer, I am also using the "shuffle" parameter.

Regarding this, I have two questions:

1. How is the batch sampled/selected from the .txt files?

Is it constructed by getting the first N examples (let's say N is the batch size) of the file, shuffling them and feeding them to the network? In this sense, the batch itself may not be balanced. Does this affect the training/fine-tuning?

Another way would be to randomly sample N/2 examples from one class and N/2 from the other, but I don't think Caffe does this.

2. Does the order of the examples in the .txt files matter to how the batch is constructed?

Would it be a good idea to build the .txt file in such a way that each batch is balanced (for example, every odd line is of one class and every even line is of the other)?

Thanks for your help!

aschipfl
rafaspadilha

2 Answers


(1) Yes, shuffle will randomize the order of input examples, provided that the examples are appropriately delimited; for an ImageData list file, that means one image path and label per line. Caffe does not balance each batch by class.

This has a minor effect on training, but should even out in the long run. The important thing is to have each example presented exactly once per epoch.

(2) Pre-balancing won't matter: shuffle changes the order as it sees fit (random number generation).
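To make the point concrete, here is a minimal plain-Python simulation, not Caffe code. It assumes the layer shuffles the whole list and then reads consecutive batches from it, which is how I understand the ImageData layer to behave. The function name `epoch_batches` and the toy label list are made up for illustration.

```python
import random
from collections import Counter

def epoch_batches(labels, batch_size, seed=0):
    """Shuffle all example indices once, then cut them into consecutive batches."""
    rng = random.Random(seed)
    order = list(range(len(labels)))
    rng.shuffle(order)
    return [order[i:i + batch_size] for i in range(0, len(order), batch_size)]

# Toy list: 100 examples, balanced and interleaved by class (0, 1, 0, 1, ...)
labels = [0, 1] * 50
batches = epoch_batches(labels, batch_size=10)

# Per-batch class counts: individual batches are usually not exactly 5/5
for b in batches:
    print(Counter(labels[i] for i in b))

# Over the whole epoch, every example still appears exactly once
seen = sorted(i for b in batches for i in b)
epoch_counts = Counter(labels[i] for b in batches for i in b)
print(epoch_counts)
```

The interleaved order of the input list is gone after the shuffle, so single batches drift away from a 50/50 split while the epoch as a whole stays balanced.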

Prune

You have two options:

1- Pre-balance the data, and disable shuffle.

2- Create your own batches on the fly: In Python, you can create your own batch as a numpy array and feed it into the network. Check this post to see how to input data using the Python interface. In this scenario, you can create any batch that meets your needs, and you can balance it as well. When using the deploy solution (the 3rd solution in the given post), you can set the input data for your network like this:

# x: numpy array with the same shape as the 'data' blob; y: the matching labels
x = data
y = labels
solver.net.blobs['data'].data[...] = x
solver.net.blobs['label'].data[...] = y

You can then call solver.step(1) to run the network for one iteration (forward pass, backpropagation, and weight update).
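As a rough sketch of both options, the helpers below show one way to interleave two class lists (option 1) and to draw a class-balanced set of indices for a hand-built batch (option 2). The names `interleave_by_class` and `balanced_batch_indices` are hypothetical, the labels are toy data, and nothing here calls Caffe itself.

```python
import numpy as np

def interleave_by_class(lines_a, lines_b):
    """Option 1: alternate list-file lines of the two classes.

    With shuffle disabled and an even batch size, consecutive batches
    read from this list contain equal numbers of each class.
    """
    out = []
    for a, b in zip(lines_a, lines_b):
        out.extend([a, b])
    return out

def balanced_batch_indices(labels, batch_size, rng):
    """Option 2: draw batch_size // 2 indices per class, without replacement."""
    labels = np.asarray(labels)
    half = batch_size // 2
    idx0 = rng.choice(np.flatnonzero(labels == 0), size=half, replace=False)
    idx1 = rng.choice(np.flatnonzero(labels == 1), size=half, replace=False)
    idx = np.concatenate([idx0, idx1])
    rng.shuffle(idx)  # mix the two classes within the batch
    return idx

# Toy data: 30 examples of each class
rng = np.random.default_rng(0)
labels = np.array([0] * 30 + [1] * 30)
idx = balanced_batch_indices(labels, batch_size=8, rng=rng)

# Hypothetical list-file lines in "path label" form
mixed = interleave_by_class(["a.jpg 0", "b.jpg 0"], ["c.jpg 1", "d.jpg 1"])
```

The returned `idx` would then select the rows of your image and label arrays that get assigned to the 'data' and 'label' blobs before each solver step.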

Amir