I have a dataset in which each image has 101 labels. I know I have to use the HDF5 data layer to feed my data into the network. But the problem is that I have a multi-task setup: my network shares parameters for the first 5 layers and then branches off into two task-specific branches. Out of the 101 labels, I want to send 100 labels to one task and the remaining 1 label to the second task.
Now, how do I do this? Can I somehow do the following:
layer {
  name: "data"
  type: "HDF5Data"
  top: "data"
  top: "label1"   # a scalar label
  top: "label2"   # a vector of size 100
  include {
    phase: TRAIN
  }
  hdf5_data_param {
    source: "path/to/the/text/file/test.txt"
    batch_size: 10
  }
}
There are two top blobs in the above setup: one for the 100-dimensional vector (label2) and the other for the remaining scalar label (label1). Is this kind of setup possible?
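If it is, I would expect the two label blobs to be consumed at the heads of the two branches roughly like this (the bottom names fc_task1/fc_task2 and the loss types are placeholders for whatever my actual branches produce, not part of my real net):
layer {
  name: "loss_task1"
  type: "EuclideanLoss"    # placeholder loss; whatever suits the 100-label task
  bottom: "fc_task1"       # hypothetical 100-d output of the first branch
  bottom: "label2"         # the 100-d label vector
  top: "loss_task1"
}
layer {
  name: "loss_task2"
  type: "SoftmaxWithLoss"  # placeholder loss for the scalar label
  bottom: "fc_task2"       # hypothetical output of the second branch
  bottom: "label1"         # the scalar label
  top: "loss_task2"
}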
I also read somewhere that one can split a multi-dimensional blob by specifying the split in the prototxt file itself. In that case I would have to use a single top blob for the label (101-dimensional) and then somehow split the 101-d vector into two vectors of 100-d and 1-d (a scalar). How can this be done?
The layer in that case would look like:
layer {
  name: "data"
  type: "HDF5Data"
  top: "data"
  top: "label"   # a vector of size 101
  include {
    phase: TRAIN
  }
  hdf5_data_param {
    source: "path/to/the/text/file/test.txt"
    batch_size: 10
  }
}
## Some layer to split the label blob into two vectors of 100-d and 1-d respectively
Any idea of how this split may work?
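My best guess, from what I have read about Caffe's Slice layer, is something along these lines (unverified; it assumes the label blob has shape N x 101 and is sliced along axis 1):
layer {
  name: "slice_label"
  type: "Slice"
  bottom: "label"
  top: "label_task1"   # first 100 dimensions (indices 0-99)
  top: "label_task2"   # last dimension (index 100), the scalar label
  slice_param {
    axis: 1
    slice_point: 100   # one slice point -> two tops: 100-d and 1-d
  }
}
The two resulting tops could then be fed as the label bottoms of the two branches' loss layers.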