I am training a CNN in Caffe whose output is one of two classes (a binary classification problem).
I am using an ImageData layer as the input layer, passing two .txt files with the training and validation sets of images. These files are balanced, i.e., the number of examples is the same for both classes. In this layer I am also using the "shuffle" parameter.
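For reference, the training input layer is defined roughly as follows (the batch size and file path are placeholders, not my actual values):

    layer {
      name: "data"
      type: "ImageData"
      top: "data"
      top: "label"
      include { phase: TRAIN }
      image_data_param {
        source: "train.txt"   # each line: <image path> <label>
        batch_size: 32
        shuffle: true
      }
    }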
Regarding this, I have two questions:
1. How is the batch sampled/selected from the .txt files?
Is it constructed by taking the first N examples of the file (let's say N is the batch size), shuffling them, and feeding them to the network? In that case, an individual batch may not be balanced. Does this affect the training/fine-tuning?
Another option would be to randomly sample N/2 examples from one class and N/2 from the other, but I don't think Caffe does this.
2. Does the order of the examples in the .txt files matter to how the batches are constructed?
Would it be a good idea to build the .txt files so that the batches come out balanced (for example, interleaving the classes so that every odd line is of one class and every even line is of the other, as in the sample below)?
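For instance, an interleaved train.txt (the filenames here are just illustrative) would look like this:

    images/class0_0001.jpg 0
    images/class1_0001.jpg 1
    images/class0_0002.jpg 0
    images/class1_0002.jpg 1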
Thanks for your help!