
Question

I tried to create a CNN that uses images as labels, with values between 0 and 1. After some training my net has a loss of roughly 23. Now I want to inspect the results, for which I am using this Python script:

import caffe
import numpy as np
from PIL import Image

net = caffe.Net('D:/caffe/net.prototxt',
            'D:/caffe/net_iter_35000.caffemodel',
            caffe.TEST)

# load input and configure preprocessing
transformer = caffe.io.Transformer({'data': net.blobs['data'].data.shape})

transformer.set_mean('data', np.load('train_mean.npy').mean(1).mean(1))
transformer.set_transpose('data', (2,0,1))
transformer.set_channel_swap('data', (2,1,0))
transformer.set_raw_scale('data', 255.0)

# note: we can change the batch size on the fly
# since we classify only one image, we change the batch size from 50 (see net.prototxt) to 1
net.blobs['data'].reshape(1,3,360,360)

#load the image in the data layer
im = caffe.io.load_image('train/img0.png')
net.blobs['data'].data[...] = transformer.preprocess('data', im)

#compute
out = net.forward()

result = out['conv7'][0][0]

I expected the values of result to lie approximately between 0 and 1, but in reality result.max() returns 5.92 and result.min() returns -4315.5.

Is there a mistake in the Python script, or are these values normal for a loss of 23?


Additional Info

My train_test.prototxt:

layer {
  name: "mynet"
  type: "Data"
  top: "data0"
  top: "label0"
  include {
    phase: TRAIN
  }
  transform_param {
    mean_file: "train_mean.binaryproto"
    scale: 0.00390625
  }
  data_param {
    source: "train_lmdb"
    batch_size: 32
    backend: LMDB
  }
}

layer {
  name: "mynetlabel"
  type: "Data"
  top: "data1"
  top: "label1"
  include {
    phase: TRAIN
  }
  transform_param {
    scale: 0.00390625
  }
  data_param {
    source: "train_label_lmdb_2"
    batch_size: 32
    backend: LMDB
  }
}

layer {
  name: "mnist"
  type: "Data"
  top: "data0"
  top: "label0"
  include {
    phase: TEST
  }
  transform_param {
    mean_file: "train_mean.binaryproto"
    scale: 0.00390625
  }
  data_param {
    source: "val_lmdb"
    batch_size: 16
    backend: LMDB
  }
}
layer {
  name: "mnistlabel"
  type: "Data"
  top: "data1"
  top: "label1"
  include {
    phase: TEST
  }
  transform_param {
    scale: 0.00390625
  }
  data_param {
    source: "val_label_lmdb_2"
    batch_size: 16
    backend: LMDB
  }
}
.
. 
.
layer {
  name: "conv7"
  type: "Convolution"
  bottom: "conv6"
  top: "conv7"
  param {
    lr_mult: 5.0
    decay_mult: 1.0
  }
  param {
    lr_mult: 10.0
    decay_mult: 0.0
  }
  convolution_param {
    num_output: 1
    pad: 0
    kernel_size: 1
    weight_filler {
      type: "gaussian"
      std: 0.00999999977648
    }
    bias_filler {
      type: "constant"
    }
  }
}

layer {
  name: "accuracy"
  type: "Accuracy"
  bottom: "conv7"
  bottom: "data1"
  top: "accuracy"
  include {
    phase: TEST
  }
}

layer {
  name: "loss"
  type: "SigmoidCrossEntropyLoss"
  bottom: "conv7"
  bottom: "data1"
  top: "loss"
}

My net.prototxt:

layer {
  name: "data"
  type: "Input"
  top: "data"
  input_param { shape: { dim: 50 dim: 3 dim: 360 dim: 360 } }
  transform_param {
    scale: 0.00390625
  }
}
.
.
.
layer {
  name: "conv7"
  type: "Convolution"
  bottom: "conv6"
  top: "conv7"
  param {
    lr_mult: 5.0
    decay_mult: 1.0
  }
  param {
    lr_mult: 10.0
    decay_mult: 0.0
  }
  convolution_param {
    num_output: 1
    pad: 0
    kernel_size: 1
    weight_filler {
      type: "gaussian"
      std: 0.00999999977648
    }
    bias_filler {
      type: "constant"
    }
  }
}
Shai
SimpleNotGood

1 Answer


Your train_test.prototxt uses "SigmoidCrossEntropyLoss". As the name of this layer suggests, it internally combines a "Sigmoid" layer and a cross-entropy loss. The raw "conv7" output you read in the script is therefore the input to that sigmoid, which is not bounded to [0, 1]. When deploying your net, you should replace the loss layer with a "Sigmoid" layer in your net.prototxt file.
See this answer for more details.
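
As a rough illustration of why the raw values fall outside [0, 1], the sigmoid can also be applied to the "conv7" output by hand in NumPy. This is a minimal sketch and not part of the original answer: the helper name `sigmoid` and the sample array are assumptions, and the two extreme entries are simply the max and min reported in the question.

```python
import numpy as np

def sigmoid(x):
    """Numerically stable logistic function, applied element-wise (hypothetical helper)."""
    x = np.asarray(x, dtype=np.float64)
    out = np.empty_like(x)
    pos = x >= 0
    # For x >= 0 use 1 / (1 + exp(-x)); exp(-x) cannot overflow here.
    out[pos] = 1.0 / (1.0 + np.exp(-x[pos]))
    # For x < 0 use exp(x) / (1 + exp(x)); exp(x) cannot overflow here.
    ex = np.exp(x[~pos])
    out[~pos] = ex / (1.0 + ex)
    return out

# Sample raw scores; 5.92 and -4315.5 are the extremes from the question.
raw = np.array([[5.92, 0.0], [-4315.5, -1.0]])
prob = sigmoid(raw)  # every entry now lies in [0, 1]
```

In the script from the question, the equivalent step would presumably be `sigmoid(out['conv7'][0][0])`, assuming net.prototxt still ends at "conv7" without a "Sigmoid" layer.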

PS: using an "Accuracy" layer for a single binary output is not supported in Caffe. The "Accuracy" layer assumes the dimension of your prediction equals the number of classes (good for "SoftmaxWithLoss"). In your case you have two labels {0, 1}, but the dimension of the output is only 1. See this answer for more details.
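
Since "Accuracy" does not apply to a single continuous output, one possible way to evaluate such per-pixel predictions is to compute a simple error measure in NumPy outside of Caffe. The sketch below uses mean absolute error and a thresholded pixel agreement; both metrics, the function names, the sample arrays, and the 0.5 threshold are assumptions chosen for illustration, not something the answer prescribes.

```python
import numpy as np

def mean_absolute_error(pred, target):
    """Average absolute per-pixel difference between prediction and label."""
    pred = np.asarray(pred, dtype=np.float64)
    target = np.asarray(target, dtype=np.float64)
    return float(np.mean(np.abs(pred - target)))

def thresholded_agreement(pred, target, threshold=0.5):
    """Fraction of pixels where prediction and label fall on the same side of the threshold."""
    pred_bin = np.asarray(pred) >= threshold
    target_bin = np.asarray(target) >= threshold
    return float(np.mean(pred_bin == target_bin))

# Made-up 2x2 prediction (after the sigmoid) and label image, both in [0, 1].
pred = np.array([[0.9, 0.2], [0.4, 0.7]])
target = np.array([[1.0, 0.0], [0.5, 0.5]])

mae = mean_absolute_error(pred, target)
agree = thresholded_agreement(pred, target)
```

Both functions expect predictions that have already been passed through the sigmoid, so that they are on the same [0, 1] scale as the label images.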

Shai
  • First of all Shai you are awesome. Regarding your PS: I am using an image as label and multiple positions in this image can be values between zero and one (not just binary zero and one). So I am guessing this is not exactly what you are referring to. Do you suggest I use SoftmaxWithLoss instead of SigmoidWithCrossEntropy? Or am I not understanding your answer correctly? – SimpleNotGood Dec 04 '17 at 14:41
  • @SimpleNotGood Regarding `"Accuracy"`: you cannot use this layer to measure accuracy of single output predictions. Caffe's `"Accuracy"` layer is designed to measure accuracy for classification task only. To measure accuracy of your **continuous**, single dim, predictions you need to use a different method. – Shai Dec 04 '17 at 14:43
  • @SimpleNotGood if your prediction is a continuous value in range [0,1] then it does not seem like you need to use `"Softmax"` instead of `"Sigmoid"`. – Shai Dec 04 '17 at 14:44
  • Regarding "Accuracy": Isn't my output image a multi output prediction? Because I can see every pixel of my image as one output. Which means I have a multi dim prediction. So, where is my misconception? – SimpleNotGood Dec 04 '17 at 15:11
  • @SimpleNotGood you have a 1dim prediction per pixel. For instance, in semantic segmentation with 80 classes, you have 80dim prediction **per pixel**. The multi-dim prediction refers to the channel dimension. – Shai Dec 04 '17 at 15:13
  • Now I get it, thank you. What do you suggest instead? Just deleting the "Accuracy" layer? – SimpleNotGood Dec 04 '17 at 15:19
  • @SimpleNotGood suppose so – Shai Dec 04 '17 at 15:22
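
The distinction drawn in the comments between a 1-dim and an 80-dim prediction per pixel shows up in the blob shapes. Here is a small NumPy sketch, assuming Caffe's (N, C, H, W) blob layout; the random arrays stand in for real network outputs and the 4x4 spatial size is arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)

# Single-channel output, like conv7 with num_output: 1 -> one value per pixel.
single = rng.standard_normal((1, 1, 4, 4))

# Hypothetical 80-class segmentation output -> 80 scores per pixel.
multi = rng.standard_normal((1, 80, 4, 4))

# Classification-style accuracy picks the best class along the channel axis.
labels_multi = multi.argmax(axis=1)    # shape (1, 4, 4), values in 0..79
labels_single = single.argmax(axis=1)  # shape (1, 4, 4), always 0
```

With only one channel, the argmax over the channel axis is always 0, which is why a classification-style accuracy carries no information for this kind of output.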