
I want to fine-tune GoogLeNet to do multi-label classification with Caffe. I have already fine-tuned it for single-label classification, but I can't make the transition to multi-label yet.

The main steps that differ from the single-label case:

Create LMDB for Data & Ground Truth

I am modifying the code here and here to create one LMDB with the data and another with the ground truth.
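For the ground-truth DB, the idea is roughly the following sketch (the labels list, DB path, and key format here are illustrative, not my exact code; what matters is that the keys line up with the image LMDB):

import lmdb
import numpy as np
import caffe

# Illustrative: one binary vector per image, in the same order as the image LMDB
labels = [np.array([1, 0, 1], dtype=np.uint8)]

db = lmdb.open('labels_lmdb', map_size=int(1e9))
with db.begin(write=True) as txn:
    for idx, lab in enumerate(labels):
        lab = lab.reshape((len(lab), 1, 1))   # array_to_datum expects a CxHxW array
        datum = caffe.io.array_to_datum(lab)
        txn.put('{:0>10d}'.format(idx), datum.SerializeToString())
db.close()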

Replacing SoftmaxWithLoss with SigmoidCrossEntropyLoss

In train_val.prototxt, I replace the SoftmaxWithLoss layers with SigmoidCrossEntropyLoss and set up the data layers so that both DBs are loaded (a sketch follows). I set the learning-rate parameters as I did for the single-label classification problem.
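A sketch of the relevant train_val.prototxt pieces (the DB paths, batch size, and mean values here are placeholders, and the same loss replacement applies to GoogLeNet's two auxiliary loss branches):

layer {
  name: "data"
  type: "Data"
  top: "data"
  data_param { source: "images_lmdb" backend: LMDB batch_size: 32 }
  transform_param { mean_value: 104 mean_value: 117 mean_value: 123 }
}
layer {
  name: "labels"
  type: "Data"
  top: "labels"
  data_param { source: "labels_lmdb" backend: LMDB batch_size: 32 }
}
layer {
  name: "loss"
  type: "SigmoidCrossEntropyLoss"
  bottom: "loss3/classifier"
  bottom: "labels"
  top: "loss"
}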

These steps seem to be working: the data flows, and it is possible to perform solver.step(1). To verify that the data and labels are loaded correctly, I explicitly calculated the loss using the formula and got the same result as Caffe.
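For reference, this is roughly the check I mean, assuming scores is the (batch, classes) array of raw logits fed into the loss layer and targets the matching 0/1 labels (both names are mine):

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_cross_entropy_loss(scores, targets):
    p = sigmoid(scores)
    # Caffe's SigmoidCrossEntropyLoss sums over all classes and
    # normalizes by the batch size only
    return -np.sum(targets * np.log(p) + (1 - targets) * np.log(1 - p)) / scores.shape[0]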

Problem

The network does not converge. After running several hundred iterations, the prediction for each class averages around that class's frequency in the population. That is, if class a has 35% ones and 65% zeros in the population, the network converges to a classification probability of ~0.35 for every observation, regardless of the true label.

Possible error 1

I suspect the problem is that I fail to load the images into Caffe in a way the pretrained GoogLeNet model can learn from. My previous experience is with convert_imageset, which works perfectly. Right now I am using shelhamer's code to save the images into the LMDB:

import os
import numpy as np
from PIL import Image
import caffe

im = np.array(Image.open(os.path.join(data_dir, in_)))  # load as RGB, HxWxC
im = im[:, :, ::-1]                                     # RGB -> BGR (Caffe convention)
im = im.transpose((2, 0, 1))                            # HxWxC -> CxHxW
im_dat = caffe.io.array_to_datum(im)
in_txn.put('{:0>10d}'.format(in_idx), im_dat.SerializeToString())

I handle mean subtraction in the data layer when loading the image. Does that seem right? Is there another way to do it?

Possible error 2

It might also be that train_val.prototxt is defined incorrectly. Is there anything else that needs to be done besides switching SoftmaxWithLoss -> SigmoidCrossEntropyLoss?

Any help will be greatly appreciated! Thanks!

YotamH
  • It seems like your net gets stuck with an "all 1" prediction. It might happen if your gradients are too high, driving the parameters into meaningless regions. Can you plot a graph of training loss vs. iteration number? I would try reducing the learning rate by one or two *orders of magnitude*, re-train, and see if the model gets stuck again. – Shai Oct 20 '15 at 06:33
  • If your network is not converging, you should check the learning rate. Generally for fine-tuning, you should have a lower learning rate (or a high learning rate to begin with and a rapid decay). If the train loss is increasing over epochs, it is an indication that your learning rate is too high. – user3334059 Apr 14 '16 at 15:10
  • Have you seen this issue? https://github.com/BVLC/caffe/issues/2407 – ginge Jan 22 '17 at 15:58

1 Answer


For GoogLeNet, the mean should be subtracted from the input data:

...
im = im.transpose((2,0,1))                   # HxWxC -> CxHxW
im = im.astype(np.float32)                   # cast first: in-place subtraction would fail on uint8
mean = np.array((104, 117, 123), dtype=np.float32).reshape((3, 1, 1))  # per-channel BGR mean
im -= mean
im_dat = caffe.io.array_to_datum(im)
...
Ilya Ovodov