
TensorFlow queues offered the advantage that data could be fetched and queued independently of the rest of the graph, allowing the CPU/disk to pre-fetch data so that the GPUs don't run dry.

I've read in a blog post that this is missing again with the Dataset API. However, Dataset.shuffle() takes a buffer_size argument, which I would assume gives a buffer queue? Is this the same as combining the Dataset API with a queue (see code below)? Is there a recommended way to create a proper, independent data-fetching queue?


Code example for Dataset API + Queue:

import tensorflow as tf

sample_set = tf.data.Dataset.from_generator(...)
sample = sample_set.make_one_shot_iterator().get_next()
# shuffle_batch builds a RandomShuffleQueue and fills it with num_threads
# enqueue threads, dequeuing batches of size 10 as they become available.
sample_batch = tf.train.shuffle_batch([sample], batch_size=10,
                                       capacity=30, num_threads=1,
                                       min_after_dequeue=1)

... is this the same as the following, pure Dataset API version? (And how can I define the number of threads there?)

sample_set = tf.data.Dataset.from_generator(...)
# Shuffling and batching happen inside the Dataset pipeline itself.
sample_set = sample_set.shuffle(buffer_size=30)
sample_set = sample_set.batch(10)
sample = sample_set.make_one_shot_iterator().get_next()
Honeybear
  • Check here, it should hopefully answer it: https://stackoverflow.com/questions/46444018/meaning-of-buffer-size-in-dataset-map-dataset-prefetch-and-dataset-shuffle Short answer is yes, with the Dataset API you still get preloading in the background while processing on the GPU – Burton2000 Mar 20 '18 at 16:29
  • Thanks, the "duplicate" and your link provide enough insights to give me all I need. `Dataset.prefetch()` and the `num_parallel_calls` of `Dataset.map()` basically offer everything for multi-threaded prefetching, making Queues obsolete (and then obviously combining Dataset API + Queues is a bad idea) – Honeybear Mar 20 '18 at 17:20

0 Answers