tf.metrics.accuracy and hand-written accuracy function give different results

Question

I am trying to understand how tf.metrics.accuracy works. I want to compare the per-batch accuracy computed by the hand-written function below

with tf.name_scope('Accuracy1'):
        correct_prediction = tf.equal(tf.argmax(predictions, 1), tf.argmax(y, 1))
        accuracy1 = tf.reduce_mean(tf.cast(correct_prediction, tf.float32), name="accuracy")

with

with tf.name_scope('Accuracy2'):
        accuracy2, accuracy_op = tf.metrics.accuracy(labels=tf.argmax(y, 1), predictions=tf.argmax(predictions, 1))

A minimal working example is provided below:

import numpy as np 
import pandas as pd 
import tensorflow as tf
import math

from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)

num_steps=28
num_inputs = 28
num_classes = 10
num_neurons = 128
num_layers = 3
batch_size = 500

graph = tf.Graph()
with graph.as_default():
    with tf.name_scope("graph_inputs"):
        X = tf.placeholder(tf.float32, [None, num_steps, num_inputs], name='input_placeholder')
        y = tf.placeholder(tf.float32, [None, num_classes], name='labels_placeholder')
        # Keep probability for dropout; defaults to 1.0 (no dropout) when not fed.
        output_keep_prob = tf.placeholder_with_default(1.0, shape=(), name="output_dropout")

    def build_lstm_cell(num_neurons, output_keep_prob):
        """Returns a dropout-wrapped LSTM cell.
        See https://stackoverflow.com/a/44882273/2628369 for why this local function is necessary.
        Returns:
        tf.contrib.rnn.DropoutWrapper: The dropout-wrapped LSTM cell.
        """
        initializer = tf.contrib.layers.xavier_initializer()
        lstm_cell = tf.contrib.rnn.LSTMCell(num_units=num_neurons, initializer=initializer, forget_bias=1.0, state_is_tuple=True, name='LSTM_cell')
        lstm_cell_drop = tf.contrib.rnn.DropoutWrapper(lstm_cell, output_keep_prob=output_keep_prob)
        return lstm_cell_drop

    # The model must be built inside graph.as_default() so that it shares a graph with X and y.
    with tf.name_scope("LSTM"):
        with tf.name_scope("Cell"):
            multi_layer_cell = tf.contrib.rnn.MultiRNNCell([build_lstm_cell(num_neurons, output_keep_prob) for _ in range(num_layers)], state_is_tuple=True)
        with tf.name_scope("Model"):
            outputs, states = tf.nn.dynamic_rnn(cell=multi_layer_cell, inputs=X, swap_memory=False, time_major=False, dtype=tf.float32)  # [batch_size, time_steps, num_neurons]
        with tf.name_scope("Graph_Outputs"):
            outputs = tf.transpose(outputs, [1, 0, 2])  # [num_timesteps, batch_size, num_neurons]
            outputs = tf.gather(outputs, int(outputs.get_shape()[0]) - 1)  # last time step: [batch_size, num_neurons]
        with tf.variable_scope('Softmax'):
            logits = tf.layers.dense(inputs=outputs, units=num_classes, name="logits")  # [batch_size, num_classes]
        with tf.name_scope('Predictions'):
            predictions = tf.nn.softmax(logits, name="predictions")  # [batch_size, num_classes]
        with tf.name_scope('Accuracy1'):
            correct_prediction = tf.equal(tf.argmax(predictions, 1), tf.argmax(y, 1))
            accuracy1 = tf.reduce_mean(tf.cast(correct_prediction, tf.float32), name="accuracy")
        with tf.name_scope('Accuracy2'):
            accuracy2, accuracy_op = tf.metrics.accuracy(labels=tf.argmax(y, 1), predictions=tf.argmax(predictions, 1))
        with tf.name_scope('Loss'):
            xentropy = tf.nn.softmax_cross_entropy_with_logits_v2(logits=logits, labels=y)
            loss = tf.reduce_mean(xentropy, name="loss")
        with tf.name_scope('Train'):
            optimizer = tf.train.AdamOptimizer(learning_rate=0.0001)
            trainer = optimizer.minimize(loss, name="training_op")

with tf.Session(graph = graph) as sess:
    tf.global_variables_initializer().run()
    total_batch = mnist.train.num_examples // batch_size
    for batch in range(total_batch):
        tf.local_variables_initializer().run()
        xBatch, yBatch = mnist.train.next_batch(batch_size)
        xBatch = xBatch.reshape((batch_size, num_steps, num_inputs))
        sess.run(trainer, feed_dict={X: xBatch, y: yBatch, output_keep_prob: 0.5})
        miniBatchAccuracy1 = sess.run(accuracy1, feed_dict={X: xBatch, y: yBatch, output_keep_prob: 0.5})
        print('[hand-written] Batch {} accuracy: {}'.format(batch, miniBatchAccuracy1))
        accuracy_op_val = sess.run(accuracy_op, feed_dict={X: xBatch, y: yBatch, output_keep_prob: 0.5})
        miniBatchAccuracy2 = sess.run(accuracy2)
        print("[tf.metrics.accuracy] Batch {} accuracy: {}".format(batch, miniBatchAccuracy2))
    sess.close()

I print the accuracy of each batch using both approaches, and the values are different. Shouldn't the results be the same?

[hand-written] Batch 0 accuracy: 0.09600000083446503
[tf.metrics.accuracy] Batch 0 accuracy: 0.09399999678134918

[hand-written] Batch 1 accuracy: 0.1120000034570694
[tf.metrics.accuracy] Batch 1 accuracy: 0.07800000160932541

[hand-written] Batch 2 accuracy: 0.10199999809265137
[tf.metrics.accuracy] Batch 2 accuracy: 0.09600000083446503

[hand-written] Batch 3 accuracy: 0.12999999523162842
[tf.metrics.accuracy] Batch 3 accuracy: 0.12800000607967377

[hand-written] Batch 4 accuracy: 0.1379999965429306
[tf.metrics.accuracy] Batch 4 accuracy: 0.10199999809265137

[hand-written] Batch 5 accuracy: 0.16200000047683716
[tf.metrics.accuracy] Batch 5 accuracy: 0.1340000033378601

[hand-written] Batch 6 accuracy: 0.1340000033378601
[tf.metrics.accuracy] Batch 6 accuracy: 0.12600000202655792

[hand-written] Batch 7 accuracy: 0.12999999523162842
[tf.metrics.accuracy] Batch 7 accuracy: 0.16200000047683716
...
...
...
...

Look at these answers: https://stackoverflow.com/a/46414395/5825953, https://stackoverflow.com/a/50746989/5825953 – Mitiku, Nov 12 '18 at 05:58

score 1 · Accepted Answer · answered Nov 12 '18 at 07:00

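When measuring the accuracy in both cases, you are feeding a dropout keep probability of 0.5 (output_keep_prob: 0.5). Each sess.run call then draws a fresh random dropout mask, so the two metrics are computed from different predictions, which is why you get two different values. Set output_keep_prob to 1.0 when measuring accuracy and you should see matching values in both cases.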
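To illustrate why dropout alone is enough to make the two numbers disagree, here is a minimal pure-Python sketch. It is not the asker's LSTM: the toy linear classifier, the helper names (dropout, predict, batch_accuracy) and the random data are all assumptions made up for illustration. It only shows that two passes over the same batch with a keep probability of 0.5 draw independent masks, while two passes with a keep probability of 1.0 are deterministic.

```python
import random


def dropout(values, keep_prob, rng):
    """Inverted dropout: keep each unit with probability keep_prob and rescale it."""
    if keep_prob >= 1.0:
        return list(values)  # nothing is dropped
    return [v / keep_prob if rng.random() < keep_prob else 0.0 for v in values]


def predict(features, weights, keep_prob, rng):
    """Toy classifier (hypothetical): dropout on the features, then argmax of a linear layer."""
    dropped = dropout(features, keep_prob, rng)
    scores = [sum(w * f for w, f in zip(row, dropped)) for row in weights]
    return scores.index(max(scores))


def batch_accuracy(batch, labels, weights, keep_prob, rng):
    """Fraction of examples whose predicted class equals the label."""
    preds = [predict(x, weights, keep_prob, rng) for x in batch]
    return sum(p == t for p, t in zip(preds, labels)) / len(labels)


rng = random.Random(0)
num_features, num_classes, batch_size = 8, 3, 200
weights = [[rng.gauss(0, 1) for _ in range(num_features)] for _ in range(num_classes)]
batch = [[rng.gauss(0, 1) for _ in range(num_features)] for _ in range(batch_size)]
labels = [rng.randrange(num_classes) for _ in range(batch_size)]

# Two passes with keep_prob=0.5: each pass draws its own random masks,
# so the predictions (and usually the accuracies) differ between passes.
first = batch_accuracy(batch, labels, weights, 0.5, rng)
second = batch_accuracy(batch, labels, weights, 0.5, rng)

# Two passes with keep_prob=1.0: no randomness, so the results are identical.
eval_a = batch_accuracy(batch, labels, weights, 1.0, rng)
eval_b = batch_accuracy(batch, labels, weights, 1.0, rng)

print("keep_prob=0.5:", first, second)
print("keep_prob=1.0:", eval_a, eval_b)
```

In the question's loop the situation is the same: accuracy1 and accuracy_op are fetched in separate sess.run calls, each with output_keep_prob: 0.5, so each call sees its own dropout mask.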

Vijay Mariappan

Thank you, Vijay! But I do not understand the reason why. Predictions and labels are the same for both functions. How does dropout affect it? – ARAT Nov 12 '18 at 14:47
Dropout randomly drops units and their connections, so the predictions will differ from one run to the next even with the same keep probability. Accuracy should be measured without dropout (that is, with the keep probability set to 1). – Vijay Mariappan Nov 12 '18 at 16:17
It totally makes sense. Now I understand. Thanks again. So correct me if I am wrong: I should not call `trainer` and `accuracy` in the same `sess.run()`, is that correct? Because `trainer` will be fed `output_keep_prob: 0.5` and `accuracy` will be fed `output_keep_prob: 1`. Is it the same for `loss`? Can I do `sess.run([trainer, loss], feed_dict={X: xBatch, y: yBatch, output_keep_prob: 0.5})` for batch training? – ARAT Nov 12 '18 at 16:40
Yes, you are right: you don't call trainer and accuracy in the same `sess.run()` when you want to measure the accuracy of the network. You can run [trainer, loss] together to get the loss during training. – Vijay Mariappan Nov 12 '18 at 17:02
Thank you very much! Your answers are much appreciated! – ARAT Nov 12 '18 at 18:46
@vijaym, can you take a look at this question, https://stackoverflow.com/q/58926940/5904928, about why there is a huge difference between the LSTM final output and the state output? – Aaditya Ura Nov 19 '19 at 05:12
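One more detail matters for the per-batch comparison in the question. tf.metrics.accuracy is a running metric: it keeps two local variables, total and count, and its update op adds the current batch to them and returns the running ratio. The question's loop runs tf.local_variables_initializer() before every batch, which is why the value there is a per-batch figure. The sketch below imitates that behavior in plain Python; the StreamingAccuracy class and the tiny label lists are hypothetical stand-ins for illustration, not TensorFlow code.

```python
class StreamingAccuracy:
    """Pure-Python stand-in for the running total/count behind tf.metrics.accuracy."""

    def __init__(self):
        self.reset()

    def reset(self):
        # Plays the role of re-running tf.local_variables_initializer().
        self.total = 0.0
        self.count = 0.0

    def update(self, labels, predictions):
        # Plays the role of the update op: accumulate, then return the running value.
        self.total += sum(1.0 for t, p in zip(labels, predictions) if t == p)
        self.count += len(labels)
        return self.value()

    def value(self):
        return self.total / self.count if self.count else 0.0


def batch_accuracy(labels, predictions):
    """Hand-written accuracy: mean of the per-example matches in one batch."""
    return sum(t == p for t, p in zip(labels, predictions)) / len(labels)


batches = [
    ([0, 1, 2, 1], [0, 1, 1, 1]),  # 3 of 4 correct
    ([2, 2, 0, 1], [2, 0, 0, 0]),  # 2 of 4 correct
]

# Resetting before each batch: the streaming value equals the hand-written one.
metric = StreamingAccuracy()
per_batch = []
for labels, predictions in batches:
    metric.reset()
    per_batch.append((batch_accuracy(labels, predictions), metric.update(labels, predictions)))

# No reset between batches: the value is cumulative over everything seen so far.
metric.reset()
running = [metric.update(labels, predictions) for labels, predictions in batches]

print(per_batch)  # [(0.75, 0.75), (0.5, 0.5)]
print(running)    # [0.75, 0.625]
```

With the same predictions fed to both metrics and a reset before each batch, the two values agree, so any remaining gap in the question's output comes from the predictions themselves changing between runs.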
