
I'm training a neural network with Caffe. In the solver.prototxt file, I can set average_loss to print the loss averaged over the last N iterations. Is it possible to do the same with other outputs as well?

For example, I wrote a custom PythonLayer outputting accuracy, and I would like to display the average accuracy over the last N iterations as well.

Thanks,

EDIT: here is the log. The DEBUG lines show the accuracy computed for each image; every 3 iterations (average_loss: 3 and display: 3), the accuracy is displayed along with the loss. Only the last value is displayed, while what I want is the average of the 3.

2018-04-24 10:38:06,383 [DEBUG]: Accuracy: 0 / 524288 = 0.000000
I0424 10:38:07.517436 99964 solver.cpp:251] Iteration 0, loss = 1.84883e+06
I0424 10:38:07.517503 99964 solver.cpp:267]     Train net output #0: accuracy = 0
I0424 10:38:07.517521 99964 solver.cpp:267]     Train net output #1: loss = 1.84883e+06 (* 1 = 1.84883e+06 loss)
I0424 10:38:07.517536 99964 sgd_solver.cpp:106] Iteration 0, lr = 2e-12
I0424 10:38:07.524904 99964 solver.cpp:287]     Time: 2.44301s/1iters
2018-04-24 10:38:08,653 [DEBUG]: Accuracy: 28569 / 524288 = 0.054491
2018-04-24 10:38:11,010 [DEBUG]: Accuracy: 22219 / 524288 = 0.042379
2018-04-24 10:38:13,326 [DEBUG]: Accuracy: 168424 / 524288 = 0.321243
I0424 10:38:14.533329 99964 solver.cpp:251] Iteration 3, loss = 1.84855e+06
I0424 10:38:14.533406 99964 solver.cpp:267]     Train net output #0: accuracy = 0.321243
I0424 10:38:14.533426 99964 solver.cpp:267]     Train net output #1: loss = 1.84833e+06 (* 1 = 1.84833e+06 loss)
I0424 10:38:14.533440 99964 sgd_solver.cpp:106] Iteration 3, lr = 2e-12
I0424 10:38:14.534195 99964 solver.cpp:287]     Time: 7.01088s/3iters
2018-04-24 10:38:15,665 [DEBUG]: Accuracy: 219089 / 524288 = 0.417879
2018-04-24 10:38:17,943 [DEBUG]: Accuracy: 202896 / 524288 = 0.386993
2018-04-24 10:38:20,210 [DEBUG]: Accuracy: 0 / 524288 = 0.000000
I0424 10:38:21.393121 99964 solver.cpp:251] Iteration 6, loss = 1.84769e+06
I0424 10:38:21.393190 99964 solver.cpp:267]     Train net output #0: accuracy = 0
I0424 10:38:21.393210 99964 solver.cpp:267]     Train net output #1: loss = 1.84816e+06 (* 1 = 1.84816e+06 loss)
I0424 10:38:21.393224 99964 sgd_solver.cpp:106] Iteration 6, lr = 2e-12
I0424 10:38:21.393940 99964 solver.cpp:287]     Time: 6.85962s/3iters
2018-04-24 10:38:22,529 [DEBUG]: Accuracy: 161180 / 524288 = 0.307426
2018-04-24 10:38:24,801 [DEBUG]: Accuracy: 178021 / 524288 = 0.339548
2018-04-24 10:38:27,090 [DEBUG]: Accuracy: 208571 / 524288 = 0.397818
I0424 10:38:28.297776 99964 solver.cpp:251] Iteration 9, loss = 1.84482e+06
I0424 10:38:28.297843 99964 solver.cpp:267]     Train net output #0: accuracy = 0.397818
I0424 10:38:28.297863 99964 solver.cpp:267]     Train net output #1: loss = 1.84361e+06 (* 1 = 1.84361e+06 loss)
I0424 10:38:28.297878 99964 sgd_solver.cpp:106] Iteration 9, lr = 2e-12
I0424 10:38:28.298607 99964 solver.cpp:287]     Time: 6.9049s/3iters
I0424 10:38:28.331749 99964 solver.cpp:506] Snapshotting to binary proto file snapshot/train_iter_10.caffemodel
I0424 10:38:36.171842 99964 sgd_solver.cpp:273] Snapshotting solver state to binary proto file snapshot/train_iter_10.solverstate
I0424 10:38:43.068686 99964 solver.cpp:362] Optimization Done.
MeanStreet

1 Answer


Caffe only averages the net's global loss (the weighted sum of all loss layers) over average_loss iterations; for all other output blobs it reports the output of the last batch only.

Therefore, if you want your Python layer to report accuracy averaged over several iterations, I suggest you store a buffer as a member of your layer class and display this aggregated value.
Alternatively, you can implement a "moving average" on top of the accuracy calculation and output this value as a "top".
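For the first option, the aggregation itself is only a few lines. Here is a minimal, caffe-independent sketch of keeping the last N values as a class member (the `RunningMean` name is just for illustration; inside your accuracy layer you would call `update` once per forward pass):

```python
from collections import deque

class RunningMean:
    """Keep the last `size` values and return their mean."""
    def __init__(self, size):
        # a deque with maxlen drops the oldest entry automatically
        self.buf = deque(maxlen=size)

    def update(self, value):
        self.buf.append(value)
        return sum(self.buf) / len(self.buf)

rm = RunningMean(3)
print(rm.update(0.0))  # 0.0
print(rm.update(0.5))  # 0.25
print(rm.update(1.0))  # 0.5 -> mean of the last 3 accuracies
```

Using a `deque` with `maxlen` keeps eviction O(1), so there is no need for an explicit pop of the oldest element.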

You can have a "moving average output layer" implemented in Python. This layer can take any number of "bottoms" and output the moving average of those bottoms.

Python code for the layer:

import caffe
class MovingAverageLayer(caffe.Layer):
  def setup(self, bottom, top):
    assert len(bottom) == len(top), "layer must have same number of inputs and outputs"
    # average over how many iterations? read from param_str
    self.buf_size = int(self.param_str)
    # allocate a buffer for each "bottom"
self.buf = [[] for _ in bottom]

  def reshape(self, bottom, top):
    # make sure inputs and outputs have the same size
    for i, b in enumerate(bottom):
      top[i].reshape(*b.data.shape)

  def forward(self, bottom, top):
    # put into buffers
    for i, b in enumerate(bottom):
      self.buf[i].append(b.data.copy())
      if len(self.buf[i]) > self.buf_size:
        self.buf[i].pop(0)
      # compute average
      a = 0
      for elem in self.buf[i]:
        a += elem
      top[i].data[...] = a / len(self.buf[i])

  def backward(self, top, propagate_down, bottom):
    # this layer does not back prop
    pass

How to use this layer in prototxt:

layer {
  name: "moving_ave"
  type: "Python"
  bottom: "accuracy"
  top: "av_accuracy"
  python_param {
    layer: "MovingAverageLayer"
    module: "path.to.module"
param_str: "30"  # average over the last 30 iterations
  }
}
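If you want to sanity-check the buffering in forward() outside caffe, the same append/pop logic can be exercised with plain numpy values (a sketch, using the first per-image accuracies from the log above):

```python
import numpy as np

def moving_average_step(buf, new_value, buf_size):
    """Same buffering as forward(): append a copy, drop the oldest
    entry once the buffer is full, return the mean of what remains."""
    buf.append(np.asarray(new_value, dtype=float).copy())
    if len(buf) > buf_size:
        buf.pop(0)
    return sum(buf) / len(buf)

# first four per-image accuracies from the log, with buf_size = 3
buf = []
for acc in [0.0, 0.054491, 0.042379, 0.321243]:
    avg = moving_average_step(buf, acc, buf_size=3)

# after the 4th step the initial 0.0 has been evicted,
# so avg is the mean of the last 3 values
print(round(float(avg), 6))
```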

See this tutorial for more information.


Original incorrect answer:
Caffe outputs to log whatever the net outputs: loss, accuracy or any other blob that appears as "top" of a layer and is not used as a "bottom" in any other layer.
Therefore, if you want to see accuracy computed by a "Python" layer, simply make sure no other layer uses this accuracy as an input.

Shai
  • Right, but only the last accuracy computed is displayed. If I set display to 100, I'll see the 99th accuracy. What I want is to output the average accuracy over the 100 elements – MeanStreet Apr 23 '18 at 17:30
  • 1
    @Mean-Street AFAIK all values displayed (loss, accuracy or other) are averaged over the last `average_loss` iterations – Shai Apr 23 '18 at 17:43
  • Let me double check the log at work tomorrow, I'll let you know. Thanks for your help anyway – MeanStreet Apr 23 '18 at 17:48
  • I checked and the accuracy is not averaged: I edited the question with the log. I can also show the accuracy PythonLayer code if needed – MeanStreet Apr 24 '18 at 08:55
  • @Mean-Street ok I didn't know that, thank you for pointing this out for me – Shai Apr 24 '18 at 20:10
  • You're welcome. Do you know if it is possible? I could hack my way in the code of the layer, storing the N last computed iterations and returning the average, but I was looking for a more elegant solution – MeanStreet Apr 25 '18 at 07:19
  • @Mean-Street I can't think of an elegant way: caffe does not seem to do this for you (apart from the global loss). But it's not exactly a "hack" to add it to your python layer code, it's quite simple and straight forward. – Shai Apr 25 '18 at 07:21
  • 1
    Nice thanks. In the meantime, I came up with a similar implementation. This [question](https://stackoverflow.com/questions/1296511/efficiency-of-using-a-python-list-as-a-queue) highlights that using `deque` is faster than insert/pop in a list (even though it's probably neglectable here) – MeanStreet Apr 25 '18 at 07:48