I need some help creating a CaffeDB for a siamese CNN from a plain directory of images and a label text file. A Python way to do it would be best.
Walking through the directory and making pairs of images is not the problem. My problem is creating a CaffeDB out of those pairs.
So far I have only used convert_imageset to create a CaffeDB from an image directory.
Thanks for your help!

-
what loss layer are you going to use? – Shai Jan 20 '16 at 16:22
-
I don't know yet. For my use case I have some images (100k) for each class (4 + a garbage class) and I want the net to distinguish between the classes better. With a "normal" linear CNN the net made too many mistakes, so I thought of trying a siamese CNN to make the net learn the differences better. If you have any suggestions for a good loss layer, please tell me. – Feuerteufel Jan 21 '16 at 08:50
-
A contrastive loss layer seems to be suitable for this use case. – Shai Jan 21 '16 at 08:53
-
thx for that, so problem with caffeDB remains... – Feuerteufel Jan 21 '16 at 10:46
1 Answer
Why don't you simply make two datasets using good old convert_imageset?
layer {
  name: "data_a"
  top: "data_a"
  top: "label_a"
  type: "Data"
  data_param { source: "/path/to/first/data_lmdb" }
  ...
}
layer {
  name: "data_b"
  top: "data_b"
  top: "label_b"
  type: "Data"
  data_param { source: "/path/to/second/data_lmdb" }
  ...
}
As for the loss, since every example has a class label, you need to convert label_a and label_b into a same_not_same_label. I suggest you do this "on-the-fly" using a Python layer. In the prototxt, add the call to the Python layer:
layer {
  name: "a_b_to_same_not_same_label"
  type: "Python"
  bottom: "label_a"
  bottom: "label_b"
  top: "same_not_same_label"
  python_param {
    # the module name -- usually the filename -- that needs to be in $PYTHONPATH
    module: "siamese"
    # the layer name -- the class name in the module
    layer: "SiameseLabels"
  }
  # propagate_down must be given either zero times or once per bottom
  propagate_down: false
  propagate_down: false
}
Create siamese.py (make sure it is in your $PYTHONPATH). In siamese.py you should have the layer class:
import sys, os
sys.path.insert(0, os.environ['CAFFE_ROOT'] + '/python')
import caffe

class SiameseLabels(caffe.Layer):
    def setup(self, bottom, top):
        if len(bottom) != 2:
            raise Exception('must have exactly two inputs')
        if len(top) != 1:
            raise Exception('must have exactly one output')

    def reshape(self, bottom, top):
        top[0].reshape(*bottom[0].shape)

    def forward(self, bottom, top):
        top[0].data[...] = (bottom[0].data == bottom[1].data).astype('f4')

    def backward(self, top, propagate_down, bottom):
        # no back prop
        pass
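To see what the forward pass produces without running Caffe, the comparison can be reproduced on plain NumPy arrays. This is only an illustrative sketch: the helper name `same_not_same` and the label values are made up, and the arrays stand in for the `label_a` and `label_b` blobs of one mini-batch.

```python
import numpy as np

def same_not_same(label_a, label_b):
    """Return 1.0 where the two class labels agree and 0.0 where they differ."""
    label_a = np.asarray(label_a)
    label_b = np.asarray(label_b)
    # Same expression as in SiameseLabels.forward, applied to plain arrays.
    return (label_a == label_b).astype('f4')

# Hypothetical class labels for a mini-batch of four pairs.
labels_a = np.array([0, 3, 2, 4], dtype='f4')
labels_b = np.array([0, 1, 2, 2], dtype='f4')

pair_labels = same_not_same(labels_a, labels_b)
print(pair_labels)
```

Pairs 0 and 2 share a class, so they are marked as "same"; the other two pairs are marked as "not same".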
Make sure you shuffle the examples in the two sets in a different manner, so you get non-trivial pairs. Moreover, if you construct the first and second data sets with different number of examples, then you will see different pairs at each epoch ;)
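One way to get two differently shuffled sets is to write two list files in the "path label" format that convert_imageset reads, each shuffled with its own seed. The sketch below is an assumption about how this could be done, not part of the original answer: the file names, the `write_shuffled_list` helper, the seeds, and the example entries are all hypothetical, and in practice the entries would be parsed from your label text file.

```python
import os
import random
import tempfile

def write_shuffled_list(entries, out_path, seed):
    """Write 'relative/path label' lines in an order determined by seed."""
    shuffled = list(entries)
    random.Random(seed).shuffle(shuffled)
    with open(out_path, 'w') as f:
        for path, label in shuffled:
            f.write('%s %d\n' % (path, label))
    return shuffled

# Hypothetical (path, label) pairs; replace with the parsed label text file.
entries = [('img_%03d.jpg' % i, i % 5) for i in range(20)]

# Temporary directory used only for this sketch; point it at your own location.
out_dir = tempfile.mkdtemp()
list_a_path = os.path.join(out_dir, 'list_a.txt')
list_b_path = os.path.join(out_dir, 'list_b.txt')

list_a = write_shuffled_list(entries, list_a_path, seed=1)
# Dropping one example makes the sets differ in length, so the pairing
# shifts by one position on every pass through the data.
list_b = write_shuffled_list(entries[:-1], list_b_path, seed=2)
```

Each list file would then be passed to its own convert_imageset run, producing the first and second LMDB referenced by the two data layers.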
Make sure you construct the network to share the weights of the duplicated layers, see this tutorial for more information.
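For reference, here is a NumPy sketch of how a contrastive loss consumes same_not_same_label. It uses the commonly cited form: the squared distance for "same" pairs, and the squared hinge on `margin - distance` for "not same" pairs, averaged over the batch with a factor of 1/2. To my understanding this matches the default behaviour of Caffe's ContrastiveLoss layer, but treat that as an assumption and check the layer documentation. The function name, feature values, and margin are illustrative.

```python
import numpy as np

def contrastive_loss(feat_a, feat_b, same, margin=1.0):
    """Batch-averaged contrastive loss; same[i] is 1 for a matching pair, else 0."""
    feat_a = np.asarray(feat_a, dtype='f4')
    feat_b = np.asarray(feat_b, dtype='f4')
    same = np.asarray(same, dtype='f4')
    # Euclidean distance between the two feature vectors of each pair.
    dist = np.sqrt(((feat_a - feat_b) ** 2).sum(axis=1))
    # Matching pairs are pulled together.
    pull = same * dist ** 2
    # Non-matching pairs are pushed apart until they are at least margin away.
    push = (1.0 - same) * np.maximum(margin - dist, 0.0) ** 2
    return float((pull + push).sum() / (2.0 * len(same)))

# Hypothetical 2-D features: pair 0 coincides, pair 1 is 5 units apart.
feat_a = [[0.0, 0.0], [0.0, 0.0]]
feat_b = [[0.0, 0.0], [3.0, 4.0]]

loss_good = contrastive_loss(feat_a, feat_b, same=[1, 0])
loss_bad = contrastive_loss(feat_a, feat_b, same=[0, 1])
print(loss_good, loss_bad)
```

With labels `[1, 0]` the features already agree with the pair labels and the loss is zero. With `[0, 1]` both pairs are penalised: the coinciding pair for being too close, the distant pair for being too far apart.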

-
I found no siamese.py file, neither in caffe/python nor in python2.7 install dir. I'm working on Ubuntu 15.04 and got the caffe-master branch in 10/2015. There is only the mnist siamese example and I already designed the net like in the tutorial with shared parameter, only the beginning with the data input is not clear to me. I don't use a python layer so far. I just define the net and run caffe with train command for a given solver.prototxt. Like: caffe train -solver solver.prototxt -gpu all. My data layer refers to directory with *.mdb and the mean binaryproto file – Feuerteufel Jan 21 '16 at 12:37
-
@Feuerteufel you need to **create** a `siamese.py` file and make sure it is in your `$PYTHONPATH`. This file should contain the code in the answer (along with the proper `import`s that are required to `import caffe`). If you enabled a python layer in your [Makefile](https://github.com/BVLC/caffe/blob/master/Makefile.config.example#L82) then caffe will run the python code for you as part of its `caffe train`. – Shai Jan 21 '16 at 12:57
-
Ok, python layer was not enabled so I'm rebuilding it right now. The proper lines for imports for the siamese.py are "import sys", "sys.path.insert( 0, 'path/to/caffe/python' )" and "import caffe" or something more? In the loss layer the same_not_same_label is then used as third input? – Feuerteufel Jan 21 '16 at 13:40
-
@Feuerteufel same_not_same_label is used as the label for the contrastive loss. – Shai Jan 21 '16 at 13:48
-
If I have N labels, how can I enforce that the feature vector of size N right before the contrastive loss layer represents some kind of probability for each class? Or does that come automatically with the siamese net design? – Feuerteufel Jan 22 '16 at 10:33
-
@Feuerteufel that is a big question. I cannot answer this in a comment. consider posting this as a new question – Shai Jan 22 '16 at 10:59
-
ok, http://stackoverflow.com/questions/34946937/how-to-enforce-feature-vector-representing-label-probability-with-caffe-siamese – Feuerteufel Jan 22 '16 at 12:38
-
@Shai I followed your instruction but I get this error: `Check failed: layer_param.propagate_down_size() == layer_param.bottom_size() (1 vs. 2) propagate_down param must be specified either 0 or bottom_size times`, which doesn't make sense because I set `back_propagate: false`. Where else could've gone wrong? – MoneyBall May 21 '17 at 16:33
-
@MoneyBall this is beyond the scope of a comment. please consider asking a new question – Shai May 21 '17 at 16:40
-
@Shai do you by any chance know if there is a C++ code for such layer that I can plug and play? – MoneyBall May 22 '17 at 09:22