I am trying to train a binary classification model in Caffe that tells whether an input image is a dog or background. I have 8223 positive and 33472 negative samples. My validation set contains 1200 samples, 600 of each class. The positives are snippets taken from the MS-COCO dataset. All images are resized so that the bigger dimension does not exceed 92 pixels and the smaller dimension is not smaller than 44 pixels.

After creating the LMDB files with create_imagenet.sh (resize=false), I started training with the solver and train.prototxt files below. The problem is that I get a constant test accuracy (0.513333 or 0.486667), which indicates that the network is not learning anything. I hope someone is able to help. Thank you in advance.
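For reference, the resize rule I described can be sketched as follows (a minimal sketch; `resize_dims` is a hypothetical helper I wrote for illustration, not part of my actual pipeline — note the two constraints cannot both hold when the aspect ratio exceeds 92/44 ≈ 2.09, in which case I let the smaller-side constraint win):

```python
def resize_dims(h, w, max_big=92, min_small=44):
    """Scale (h, w) so the bigger side is at most max_big,
    then enforce that the smaller side is at least min_small."""
    scale = max_big / max(h, w)           # shrink so the bigger side <= 92
    if min(h, w) * scale < min_small:     # smaller side fell below 44
        scale = min_small / min(h, w)     # grow back; bigger side may now exceed 92
    return round(h * scale), round(w * scale)

print(resize_dims(460, 640))  # typical landscape snippet -> (66, 92)
print(resize_dims(100, 300))  # extreme aspect ratio -> (44, 132)
```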
Solver file:
iter_size: 32
test_iter: 600
test_interval: 20
base_lr: 0.001
display: 2
max_iter: 20000
lr_policy: "step"
gamma: 0.99
stepsize: 700
momentum: 0.9
weight_decay: 0.0001
snapshot: 40
snapshot_prefix: "/media/DATA/classifiers_data/dog_object/models/"
solver_mode: GPU
net: "/media/DATA/classifiers_data/dog_object/net.prototxt"
solver_type: ADAM
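For clarity: with batch_size: 1 in the data layers and iter_size: 32 in the solver, gradients are accumulated over 32 images per solver iteration, and the "step" policy multiplies the learning rate by gamma every stepsize iterations (also note test_iter: 600 at batch_size 1 covers 600 of the 1200 validation images per test pass). A small sketch of what the solver computes:

```python
def effective_batch(batch_size, iter_size):
    # Caffe accumulates gradients over iter_size forward/backward passes
    return batch_size * iter_size

def step_lr(iteration, base_lr=0.001, gamma=0.99, stepsize=700):
    # "step" policy: lr = base_lr * gamma ** floor(iteration / stepsize)
    return base_lr * gamma ** (iteration // stepsize)

print(effective_batch(1, 32))  # 32 images per solver iteration
print(step_lr(699))            # still 0.001
print(step_lr(700))            # first decay: 0.00099
```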
train.prototxt:
layer {
  name: "train-data"
  type: "Data"
  top: "data"
  top: "label"
  include {
    phase: TRAIN
  }
  data_param {
    source: "/media/DATA/classifiers_data/dog_object/ilsvrc12_train_lmdb"
    batch_size: 1
    backend: LMDB
  }
}
layer {
  name: "val-data"
  type: "Data"
  top: "data"
  top: "label"
  include {
    phase: TEST
  }
  data_param {
    source: "/media/DATA/classifiers_data/dog_object/ilsvrc12_val_lmdb"
    batch_size: 1
    backend: LMDB
  }
}
layer {
  name: "scale"
  type: "Power"
  bottom: "data"
  top: "scale"
  power_param {
    scale: 0.00390625
  }
}
layer {
  bottom: "scale"
  top: "conv1_1"
  name: "conv1_1"
  type: "Convolution"
  convolution_param {
    num_output: 64
    pad: 1
    kernel_size: 3
  }
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 1
  }
}
layer {
  bottom: "conv1_1"
  top: "conv1_1"
  name: "relu1_1"
  type: "ReLU"
}
layer {
  bottom: "conv1_1"
  top: "conv1_2"
  name: "conv1_2"
  type: "Convolution"
  convolution_param {
    num_output: 64
    pad: 1
    kernel_size: 3
  }
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 1
  }
}
layer {
  bottom: "conv1_2"
  top: "conv1_2"
  name: "relu1_2"
  type: "ReLU"
}
layer {
  name: "spatial_pyramid_pooling"
  type: "SPP"
  bottom: "conv1_2"
  top: "spatial_pyramid_pooling"
  spp_param {
    pool: MAX
    pyramid_height: 4
  }
}
layer {
  bottom: "spatial_pyramid_pooling"
  top: "fc6"
  name: "fc6"
  type: "InnerProduct"
  inner_product_param {
    num_output: 64
  }
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 1
  }
}
layer {
  bottom: "fc6"
  top: "fc6"
  name: "relu6"
  type: "ReLU"
}
layer {
  bottom: "fc6"
  top: "fc6"
  name: "drop6"
  type: "Dropout"
  dropout_param {
    dropout_ratio: 0.5
  }
}
layer {
  bottom: "fc6"
  top: "fc7"
  name: "fc7"
  type: "InnerProduct"
  inner_product_param {
    num_output: 2
  }
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 1
  }
}
layer {
  name: "loss"
  type: "SoftmaxWithLoss"
  bottom: "fc7"
  bottom: "label"
  top: "loss"
}
layer {
  name: "accuracy/top1"
  type: "Accuracy"
  bottom: "fc7"
  bottom: "label"
  top: "accuracy"
  include {
    phase: TEST
  }
}
Part of the training log:
I1125 15:52:36.604038 2326 solver.cpp:362] Iteration 40, Testing net (#0)
I1125 15:52:36.604071 2326 net.cpp:723] Ignoring source layer train-data
I1125 15:52:47.127979 2326 solver.cpp:429] Test net output #0: accuracy = 0.486667
I1125 15:52:47.128067 2326 solver.cpp:429] Test net output #1: loss = 0.694894 (* 1 = 0.694894 loss)
I1125 15:52:48.937928 2326 solver.cpp:242] Iteration 40 (0.141947 iter/s, 14.0897s/2 iter), loss = 0.67717
I1125 15:52:48.938014 2326 solver.cpp:261] Train net output #0: loss = 0.655692 (* 1 = 0.655692 loss)
I1125 15:52:48.938040 2326 sgd_solver.cpp:106] Iteration 40, lr = 0.001
I1125 15:52:52.858757 2326 solver.cpp:242] Iteration 42 (0.510097 iter/s, 3.92083s/2 iter), loss = 0.673962
I1125 15:52:52.858841 2326 solver.cpp:261] Train net output #0: loss = 0.653978 (* 1 = 0.653978 loss)
I1125 15:52:52.858875 2326 sgd_solver.cpp:106] Iteration 42, lr = 0.001
I1125 15:52:56.581573 2326 solver.cpp:242] Iteration 44 (0.53723 iter/s, 3.7228s/2 iter), loss = 0.673144
I1125 15:52:56.581656 2326 solver.cpp:261] Train net output #0: loss = 0.652269 (* 1 = 0.652269 loss)
I1125 15:52:56.581689 2326 sgd_solver.cpp:106] Iteration 44, lr = 0.001
I1125 15:53:00.192082 2326 solver.cpp:242] Iteration 46 (0.553941 iter/s, 3.61049s/2 iter), loss = 0.669606
I1125 15:53:00.192167 2326 solver.cpp:261] Train net output #0: loss = 0.650559 (* 1 = 0.650559 loss)
I1125 15:53:00.192200 2326 sgd_solver.cpp:106] Iteration 46, lr = 0.001
I1125 15:53:04.195417 2326 solver.cpp:242] Iteration 48 (0.499585 iter/s, 4.00332s/2 iter), loss = 0.674327
I1125 15:53:04.195691 2326 solver.cpp:261] Train net output #0: loss = 0.648808 (* 1 = 0.648808 loss)
I1125 15:53:04.195736 2326 sgd_solver.cpp:106] Iteration 48, lr = 0.001
I1125 15:53:07.856842 2326 solver.cpp:242] Iteration 50 (0.546265 iter/s, 3.66123s/2 iter), loss = 0.661835
I1125 15:53:07.856925 2326 solver.cpp:261] Train net output #0: loss = 0.647097 (* 1 = 0.647097 loss)
I1125 15:53:07.856957 2326 sgd_solver.cpp:106] Iteration 50, lr = 0.001
I1125 15:53:11.681635 2326 solver.cpp:242] Iteration 52 (0.522906 iter/s, 3.82478s/2 iter), loss = 0.66071
I1125 15:53:11.681720 2326 solver.cpp:261] Train net output #0: loss = 0.743264 (* 1 = 0.743264 loss)
I1125 15:53:11.681754 2326 sgd_solver.cpp:106] Iteration 52, lr = 0.001
I1125 15:53:15.544859 2326 solver.cpp:242] Iteration 54 (0.517707 iter/s, 3.86319s/2 iter), loss = 0.656414
I1125 15:53:15.544950 2326 solver.cpp:261] Train net output #0: loss = 0.643741 (* 1 = 0.643741 loss)
I1125 15:53:15.544986 2326 sgd_solver.cpp:106] Iteration 54, lr = 0.001
I1125 15:53:19.354320 2326 solver.cpp:242] Iteration 56 (0.525012 iter/s, 3.80943s/2 iter), loss = 0.645277
I1125 15:53:19.354404 2326 solver.cpp:261] Train net output #0: loss = 0.747059 (* 1 = 0.747059 loss)
I1125 15:53:19.354431 2326 sgd_solver.cpp:106] Iteration 56, lr = 0.001
I1125 15:53:23.195466 2326 solver.cpp:242] Iteration 58 (0.520681 iter/s, 3.84112s/2 iter), loss = 0.677604
I1125 15:53:23.195549 2326 solver.cpp:261] Train net output #0: loss = 0.640145 (* 1 = 0.640145 loss)
I1125 15:53:23.195575 2326 sgd_solver.cpp:106] Iteration 58, lr = 0.001
I1125 15:53:25.140920 2326 solver.cpp:362] Iteration 60, Testing net (#0)
I1125 15:53:25.140965 2326 net.cpp:723] Ignoring source layer train-data
I1125 15:53:35.672775 2326 solver.cpp:429] Test net output #0: accuracy = 0.513333
I1125 15:53:35.672937 2326 solver.cpp:429] Test net output #1: loss = 0.69323 (* 1 = 0.69323 loss)
I1125 15:53:37.635395 2326 solver.cpp:242] Iteration 60 (0.138503 iter/s, 14.4401s/2 iter), loss = 0.655983
I1125 15:53:37.635478 2326 solver.cpp:261] Train net output #0: loss = 0.638368 (* 1 = 0.638368 loss)
I1125 15:53:37.635512 2326 sgd_solver.cpp:106] Iteration 60, lr = 0.001
I1125 15:53:41.458472 2326 solver.cpp:242] Iteration 62 (0.523143 iter/s, 3.82305s/2 iter), loss = 0.672996
I1125 15:53:41.458555 2326 solver.cpp:261] Train net output #0: loss = 0.753101 (* 1 = 0.753101 loss)
I1125 15:53:41.458588 2326 sgd_solver.cpp:106] Iteration 62, lr = 0.001
I1125 15:53:45.299643 2326 solver.cpp:242] Iteration 64 (0.520679 iter/s, 3.84114s/2 iter), loss = 0.668675
I1125 15:53:45.299737 2326 solver.cpp:261] Train net output #0: loss = 0.634894 (* 1 = 0.634894 loss)