
I have built a convolutional neural network in TensorFlow. It is trained, and now I am restoring it from a checkpoint and running evaluations.

import tensorflow as tf

import main
import Process
import Input

eval_dir = "/Users/Zanhuang/Desktop/NNP/model.ckpt-250"
checkpoint_dir = "/Users/Zanhuang/Desktop/NNP/checkpoint"

def evaluate():
    with tf.Graph().as_default() as g:
        # Build the evaluation input pipeline and the inference graph.
        images, labels = Process.eval_inputs()
        forward_propagation_results = Process.forward_propagation(images)
        init_op = tf.initialize_all_variables()
        saver = tf.train.Saver()
        top_k_op = tf.nn.in_top_k(forward_propagation_results, labels, 1)

    with tf.Session(graph=g) as sess:
        # Start the input queue threads, then restore the trained weights.
        tf.train.start_queue_runners(sess=sess)
        sess.run(init_op)
        saver.restore(sess, eval_dir)
        print(sess.run(top_k_op))


def main(argv=None):
    evaluate()

if __name__ == '__main__':
    tf.app.run()

Unfortunately, a strange error has popped up, and I have no clue why:

W tensorflow/core/kernels/queue_base.cc:294] _0_input_producer: Skipping cancelled enqueue attempt with queue not closed
W tensorflow/core/kernels/queue_base.cc:294] _1_batch/fifo_queue: Skipping cancelled enqueue attempt with queue not closed
E tensorflow/core/client/tensor_c_api.cc:485] Enqueue operation was cancelled
     [[Node: batch/fifo_queue_enqueue = QueueEnqueue[Tcomponents=[DT_FLOAT, DT_INT32], _class=["loc:@batch/fifo_queue"], timeout_ms=-1, _device="/job:localhost/replica:0/task:0/cpu:0"](batch/fifo_queue, Cast_1, Cast)]]
E tensorflow/core/client/tensor_c_api.cc:485] Enqueue operation was cancelled
     [[Node: batch/fifo_queue_enqueue = QueueEnqueue[Tcomponents=[DT_FLOAT, DT_INT32], _class=["loc:@batch/fifo_queue"], timeout_ms=-1, _device="/job:localhost/replica:0/task:0/cpu:0"](batch/fifo_queue, Cast_1, Cast)]]
E tensorflow/core/client/tensor_c_api.cc:485] Enqueue operation was cancelled
     ....
     [[Node: batch/fifo_queue_enqueue = QueueEnqueue[Tcomponents=[DT_FLOAT, DT_INT32], _class=["loc:@batch/fifo_queue"], timeout_ms=-1, _device="/job:localhost/replica:0/task:0/cpu:0"](batch/fifo_queue, Cast_1, Cast)]]
E tensorflow/core/client/tensor_c_api.cc:485] Enqueue operation was cancelled
     [[Node: batch/fifo_queue_enqueue = QueueEnqueue[Tcomponents=[DT_FLOAT, DT_INT32], _class=["loc:@batch/fifo_queue"], timeout_ms=-1, _device="/job:localhost/replica:0/task:0/cpu:0"](batch/fifo_queue, Cast_1, Cast)]]
W tensorflow/core/kernels/queue_base.cc:294] _1_batch/fifo_queue: Skipping cancelled enqueue attempt with queue not closed
...
W tensorflow/core/kernels/queue_base.cc:294] _1_batch/fifo_queue: Skipping cancelled enqueue attempt with queue not closed
E tensorflow/core/client/tensor_c_api.cc:485] Enqueue operation was cancelled

This is only a part of it.

asked by Zan Huang, edited by Salvador Dali

  • That's a harmless info message that's been removed in newer versions of tensorflow – Yaroslav Bulatov Jul 30 '16 at 21:00
  • I have TensorFlow 0.9.0, which is the latest version, and the nightly build. Also, it won't let me continue running the program. – Zan Huang Jul 30 '16 at 21:29
  • It's not clear that this message indicates an error. I would ignore it and try to debug the actual issue. I.e., are the queue runners enqueueing anything on the queues (look at queue.size() after starting them; see the sketch after these comments)? – Yaroslav Bulatov Jul 30 '16 at 23:21
  • Well I will copy and paste the whole error to give you a clear response. I don't think they are managing to do so correctly. – Zan Huang Jul 30 '16 at 23:37
  • Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/118722/discussion-between-zan-huang-and-yaroslav-bulatov). – Zan Huang Jul 31 '16 at 01:07
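As a concrete version of the queue.size() check suggested in the comments, here is a minimal, self-contained sketch. The FIFOQueue below is a hypothetical stand-in for the question's internal batch/fifo_queue, using the old (pre-tf.data) queue API:

import time
import tensorflow as tf

# Hypothetical stand-in for the question's input pipeline: a tiny
# FIFOQueue fed by a single queue-runner thread.
q = tf.FIFOQueue(capacity=32, dtypes=[tf.float32], shapes=[[]])
enqueue_op = q.enqueue(tf.random_normal([]))
tf.train.add_queue_runner(tf.train.QueueRunner(q, [enqueue_op]))

with tf.Session() as sess:
    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(sess=sess, coord=coord)
    time.sleep(1)              # give the runner a moment to fill the queue
    print(sess.run(q.size()))  # nonzero means the runners are enqueueing
    coord.request_stop()
    coord.join(threads)

If the size stays at zero, the enqueue side (file reading, decoding, shapes) is what needs debugging, not the warnings.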

3 Answers


Update from chat -- the program runs successfully, and the messages that are printed are due to Python killing threads while they are running as the process exits.

The messages are harmless, but you can avoid them by stopping the threads manually using the pattern below.

coord = tf.train.Coordinator()
threads = tf.train.start_queue_runners(sess=sess, coord=coord)
# ... do stuff ...
coord.request_stop()  # signal the queue-runner threads to stop
coord.join(threads)   # wait for them to finish before exiting
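For example, folded into the question's evaluate() (a sketch reusing the question's own Process functions and eval_dir; only the thread handling is new):

def evaluate():
    with tf.Graph().as_default() as g:
        images, labels = Process.eval_inputs()
        forward_propagation_results = Process.forward_propagation(images)
        top_k_op = tf.nn.in_top_k(forward_propagation_results, labels, 1)
        init_op = tf.initialize_all_variables()
        saver = tf.train.Saver()

    with tf.Session(graph=g) as sess:
        sess.run(init_op)
        saver.restore(sess, eval_dir)
        coord = tf.train.Coordinator()
        threads = tf.train.start_queue_runners(sess=sess, coord=coord)
        print(sess.run(top_k_op))
        coord.request_stop()  # ask the runner threads to exit
        coord.join(threads)   # and wait for them before the session closes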
– Yaroslav Bulatov

Everything works correctly; the problem happens at the very last stage, when Python tries to kill the threads. To do this properly, create a tf.train.Coordinator and pass it to start_queue_runners (there is no need to pass sess, since the default session will be used):

with tf.Session() as sess:
    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(coord=coord)
    # do your things
    coord.request_stop()
    coord.join(threads)
– Salvador Dali

A way to add the coordinator with exception handling:

coord = tf.train.Coordinator()
threads = tf.train.start_queue_runners(sess=sess, coord=coord)

try:
    while not coord.should_stop():
        # doing things here
        pass

except tf.errors.OutOfRangeError:
    print("things done")

finally:
    coord.request_stop()  # stop the runner threads
    coord.join(threads)   # and wait for them to exit

Then the bug is fixed :-)
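For reference, a minimal sketch of where that tf.errors.OutOfRangeError comes from: with num_epochs set, the producer closes its queue after the last epoch, and subsequent dequeues raise the error. The three-element pipeline below is hypothetical, and tf.initialize_local_variables() (needed for the epoch counter) may be spelled differently in other TF 1.x-era versions:

import tensorflow as tf

# Hypothetical one-epoch pipeline: once every element has been produced,
# the queue is closed and further dequeues raise OutOfRangeError.
queue = tf.train.input_producer([1.0, 2.0, 3.0], num_epochs=1)
value = queue.dequeue()

with tf.Session() as sess:
    sess.run(tf.initialize_all_variables())
    sess.run(tf.initialize_local_variables())  # num_epochs lives in a local variable
    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(sess=sess, coord=coord)
    try:
        while not coord.should_stop():
            print(sess.run(value))
    except tf.errors.OutOfRangeError:
        print("things done")  # the epoch limit was reached
    finally:
        coord.request_stop()
        coord.join(threads)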

– gromit