
Is there a way to let TensorFlow print extra training metrics (e.g. batch accuracy) when using the Estimator API?

One can add summaries and view the result in TensorBoard (see another post), but I was wondering if there is an elegant way to get the scalar summary values printed while training. This already happens for the training loss, e.g.:

loss = 0.672677, step = 2901 (52.995 sec)

but it would be nice to have e.g.

loss = 0.672677, accuracy = 0.54678, step = 2901 (52.995 sec)

without too much trouble. I am aware that it is usually more useful to plot test-set accuracy (I am already doing this with a validation monitor), but in this case I am also interested in the training batch accuracy.

dumkar

2 Answers


From what I've read, it is not possible to change this by passing a parameter. You can do it by creating a logging hook and passing it to the estimator run.

In the body of the model_fn function for your estimator:

logging_hook = tf.train.LoggingTensorHook(
    {"loss": loss, "accuracy": accuracy}, every_n_iter=10)

# Rest of the function

return tf.estimator.EstimatorSpec(
    ...params...
    training_hooks=[logging_hook])

EDIT:

To see the output you must also set the logging verbosity high enough (unless it is already your default): `tf.logging.set_verbosity(tf.logging.INFO)`
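
For completeness, here is a minimal sketch of how the pieces can fit together in a model_fn, assuming the TF 1.x Estimator API. The one-layer network and the feature key "x" are placeholders for your own model, and only the training path is shown. Note that tf.metrics.accuracy returns a (value, update_op) pair, so it is the update op that gets logged, which keeps the printed value moving during training:

import tensorflow as tf

def model_fn(features, labels, mode):
    # Placeholder network: a single dense layer over the feature "x".
    logits = tf.layers.dense(features["x"], units=10)
    predictions = {"classes": tf.argmax(logits, axis=1)}

    loss = tf.losses.sparse_softmax_cross_entropy(labels=labels, logits=logits)

    # tf.metrics.accuracy returns (value, update_op); logging the update op
    # refreshes the printed accuracy every time the hook fires.
    accuracy = tf.metrics.accuracy(labels=labels,
                                   predictions=predictions["classes"])

    logging_hook = tf.train.LoggingTensorHook(
        {"loss": loss, "accuracy": accuracy[1]}, every_n_iter=10)

    train_op = tf.train.GradientDescentOptimizer(0.01).minimize(
        loss, global_step=tf.train.get_global_step())

    # Only mode == TRAIN is sketched here; EVAL and PREDICT are omitted.
    return tf.estimator.EstimatorSpec(
        mode=mode,
        loss=loss,
        train_op=train_op,
        training_hooks=[logging_hook])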

Xyz
  • I needed to calculate the accuracy like this: `accuracy = tf.metrics.accuracy(labels=labels, predictions=predictions["classes"])` and then pass only the accuracy value (without the accuracy op) like this: `logging_hook = tf.train.LoggingTensorHook({"loss" : loss, "accuracy" : accuracy[1]}, every_n_iter=10)` in order to make it work. – tsveti_iko Jul 30 '18 at 12:11
  • Note that you need to pass the whole accuracy (as a tuple) like this: `eval_metric_ops = { "accuracy": accuracy }` `return tf.estimator.EstimatorSpec(mode=mode, loss=loss, eval_metric_ops=eval_metric_ops)` – tsveti_iko Jul 30 '18 at 13:31
  • I think it's "accuracy[0]" instead of "accuracy[1]" – fengyun Feb 13 '19 at 09:06
  • I'm trying this solution, but I actually get a weird error: "Passed should have graph attribute that is equal to current graph". What does it mean? – Andrea Rossi Feb 20 '19 at 09:23
  • Check if your tensors come from the graph part defined in model_fn (not somewhere outside this function) – Xyz Feb 22 '19 at 13:38
  • Incomplete: are `loss` and `accuracy` tf.summary objects, plain tensors, or values? – mathtick Apr 02 '19 at 10:32
  • @tsveti_iko, do we have an example of doing this using the TrainSpec? – E B Dec 13 '19 at 21:06

You can also use TensorBoard to see plots of the desired metrics. To do that, add the metric to a TensorFlow summary like this:

accuracy = tf.metrics.accuracy(labels=labels, predictions=predictions["classes"])
tf.summary.scalar('accuracy', accuracy[1])

The nice thing about using tf.estimator.Estimator is that you don't need to add the summaries to a FileWriter yourself, since that is done automatically: the summaries are merged and saved periodically, every 100 steps by default.
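
If 100 steps is too frequent or too rare for your case, the interval can be adjusted through RunConfig. A small sketch, where the model_fn and model directory are placeholders:

# Save summaries every 50 steps instead of the default 100.
config = tf.estimator.RunConfig(save_summary_steps=50)
estimator = tf.estimator.Estimator(
    model_fn=model_fn,       # your model_fn from above
    model_dir="/tmp/model",  # placeholder; TensorBoard reads from here
    config=config)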

Don't forget to change this line as well, based on the accuracy metric you just added:

eval_metric_ops = { "accuracy": accuracy }
return tf.estimator.EstimatorSpec(
    mode=mode, loss=loss, eval_metric_ops=eval_metric_ops)
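
As a usage sketch (eval_input_fn is a placeholder for your own input function): the metrics you declare in eval_metric_ops are exactly what estimator.evaluate() computes and returns, alongside the loss and the global step:

eval_results = estimator.evaluate(input_fn=eval_input_fn)
print(eval_results)
# e.g. {'accuracy': 0.55, 'loss': 0.67, 'global_step': 2901}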

To see the TensorBoard output, open a new terminal and type:

tensorboard --logdir=${MODEL_DIR}

After that you will be able to see the plots in your browser at localhost:6006.

tsveti_iko
  • It's not a direct answer, but I think this works better than filling your terminal with info. In my opinion, TensorBoard is much better for watching metrics than a terminal. – ricoms Aug 13 '18 at 23:01
  • Where is this documented? "merged and saved periodically, every 100 steps by default" – mathtick Apr 02 '19 at 10:26
  • How to do this for generic tensors in train? Is "eval_metric_ops" eval as in *not* train or is some other semantic meaning of "eval"? – mathtick Apr 02 '19 at 10:35
  • What does "eval_metric_ops" do? – Random Certainty Aug 01 '19 at 15:41
  • The `eval_metric_ops` is an optional dictionary of metrics that you can provide to your custom Estimator, so that these metrics will be used in the evaluation - check the documentation here: https://www.tensorflow.org/guide/custom_estimators#evaluate – tsveti_iko Aug 02 '19 at 09:02