I am extracting 30 facial keypoint values (x, y coordinates) from an input image, as in the Kaggle facial keypoints competition.

How do I set up Caffe to run a regression and produce a 30-dimensional output?

Input: 96x96 image
Output: 30 values (a 30-dimensional vector).

How do I set up Caffe accordingly? I am using EUCLIDEAN_LOSS (sum of squares) to get the regressed output. Here is a simple logistic regressor model in Caffe, but it is not working. It looks like the accuracy layer cannot handle multi-label output.

I0120 17:51:27.039113  4113 net.cpp:394] accuracy <- label_fkp_1_split_1
I0120 17:51:27.039135  4113 net.cpp:356] accuracy -> accuracy
I0120 17:51:27.039158  4113 net.cpp:96] Setting up accuracy
F0120 17:51:27.039201  4113 accuracy_layer.cpp:26] Check failed: bottom[1]->channels() == 1 (30 vs. 1) 
*** Check failure stack trace: ***
    @     0x7f7c2711bdaa  (unknown)
    @     0x7f7c2711bce4  (unknown)
    @     0x7f7c2711b6e6  (unknown)

Here is the layer file:

name: "LogReg"
layers {
  name: "fkp"
  top: "data"
  top: "label"
  type: HDF5_DATA
  hdf5_data_param {
    source: "train.txt"
    batch_size: 100
  }
  include: { phase: TRAIN }
}

layers {
  name: "fkp"
  type: HDF5_DATA
  top: "data"
  top: "label"
  hdf5_data_param {
    source: "test.txt"
    batch_size: 100
  }

  include: { phase: TEST }
}

layers {
  name: "ip"
  type: INNER_PRODUCT
  bottom: "data"
  top: "ip"
  inner_product_param {
    num_output: 30
  }
}
layers {
  name: "loss"
  type: EUCLIDEAN_LOSS
  bottom: "ip"
  bottom: "label"
  top: "loss"
}

layers {
  name: "accuracy"
  type: ACCURACY
  bottom: "ip"
  bottom: "label"
  top: "accuracy"
  include: { phase: TEST }
}

1 Answer


I found it :)

I replaced the softmax loss layer with an EUCLIDEAN_LOSS layer and changed the number of outputs. It worked.

layers {
  name: "loss"
  type: EUCLIDEAN_LOSS
  bottom: "ip1"
  bottom: "label"
  top: "loss"
}

HINGE_LOSS is another option.
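
For reference, here is a rough sketch of what I understand EUCLIDEAN_LOSS to compute: the sum of squared differences between the predictions and the labels over the batch, divided by 2N, where N is the batch size. The function name `euclidean_loss` and the list-of-lists input are my own choices for illustration, not part of Caffe.

```python
def euclidean_loss(predictions, labels):
    """Sum of squared differences over the batch, divided by 2 * N.

    predictions, labels: lists of equal-length rows (one row per sample).
    This mirrors the formula as I read it in the Caffe docs; it is an
    illustration, not Caffe's own code.
    """
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same batch size")
    n = len(predictions)
    if n == 0:
        raise ValueError("batch must not be empty")

    total = 0.0
    for pred_row, label_row in zip(predictions, labels):
        if len(pred_row) != len(label_row):
            raise ValueError("each prediction row must match its label row in length")
        # Squared Euclidean distance for one sample (30 values per row here).
        total += sum((p - t) ** 2 for p, t in zip(pred_row, label_row))

    return total / (2.0 * n)
```

With 30 values per row, a single number comes out for the whole batch, which is why no ACCURACY layer is needed for regression.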

  • What was your change in the number of outputs? – nayef Feb 18 '15 at 13:25
  • I reshaped the inputs to (total, 1, 96, 96) and the output labels to (total, 30). – pbu Feb 18 '15 at 18:45
  • Could you please explain in more detail? Why did you avoid batch mode and take just one example, and why did you change your labels to 30? – thetna Jun 24 '15 at 16:17
  • @pbu Could you explain what you did for the accuracy layer? Also, the initial .prototxt file in your question already had EUCLIDEAN_LOSS as the type. Please post the final .prototxt here. It would help a lot. Thanks – iamprem Mar 20 '16 at 03:44
  • http://corpocrat.com/2015/02/24/facial-keypoints-extraction-using-deep-learning-with-caffe/ – pbu Mar 20 '16 at 08:40
  • How do you process the output once training is finished? I can see your labels are (total, 30), but that means your labels are 1D rather than 2D (x, y coordinate values) @pbu –  Dec 01 '16 at 20:47
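
On the last question about turning the flat 30-value output back into coordinates, here is a minimal sketch. It assumes the values are interleaved as x1, y1, x2, y2, ... (the column order of the Kaggle CSV, as far as I know) and that no extra scaling was applied to the labels. The helper name `to_keypoints` is hypothetical.

```python
def to_keypoints(flat_output):
    """Group a flat vector [x1, y1, x2, y2, ...] into (x, y) pairs.

    Assumption: values are interleaved x, y per keypoint. If your labels
    were stored in a different order or were normalised before training,
    adjust or undo that first.
    """
    if len(flat_output) % 2 != 0:
        raise ValueError("expected an even number of values")
    return [
        (flat_output[i], flat_output[i + 1])
        for i in range(0, len(flat_output), 2)
    ]


# Example: a 30-value network output becomes 15 (x, y) keypoints.
example_output = [float(v) for v in range(30)]
keypoints = to_keypoints(example_output)
```

So the (total, 30) label array is just a flattened form of (total, 15, 2), and the network output can be regrouped the same way.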