2

So I've been playing around with this code: https://www.tensorflow.org/tutorials/generative/dcgan and have a fairly good idea of how it works. However, I can't figure out what the BUFFER_SIZE variable is for. I suspect it is used to create a subset of the dataset of size BUFFER_SIZE, with the batches then taken from that subset, but I don't see the point of that and can't find anyone explaining it.

So, if someone could explain to me what BUFFER_SIZE does, I would be thankful ❤

Nicolas Gervais
Jorvan
    Does this answer your question? [Meaning of buffer\_size in Dataset.map , Dataset.prefetch and Dataset.shuffle](https://stackoverflow.com/questions/46444018/meaning-of-buffer-size-in-dataset-map-dataset-prefetch-and-dataset-shuffle) – Slim Shady Nov 25 '21 at 09:34

2 Answers

2

It's used as the buffer_size argument in tf.data.Dataset.shuffle. Have you read the docs?

This dataset fills a buffer with buffer_size elements, then randomly samples elements from this buffer, replacing the selected elements with new elements. For perfect shuffling, a buffer size greater than or equal to the full size of the dataset is required.

For instance, if your dataset contains 10,000 elements but buffer_size is set to 1,000, then shuffle will initially select a random element from only the first 1,000 elements in the buffer. Once an element is selected, its space in the buffer is replaced by the next (i.e. 1,001-st) element, maintaining the 1,000 element buffer.
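The mechanism described above can be imitated in a few lines of plain Python. This is only a minimal sketch of the idea, not TensorFlow's actual implementation; the function name `shuffle_with_buffer` and the `seed` parameter are made up for illustration.

```python
import random


def shuffle_with_buffer(items, buffer_size, seed=None):
    """Yield items in the order a fixed-size shuffle buffer would emit them.

    Hypothetical helper for illustration; it mimics the behaviour described
    in the docs but is not how tf.data is implemented internally.
    """
    rng = random.Random(seed)
    buffer = []
    for item in items:
        if len(buffer) < buffer_size:
            # Still filling the buffer: nothing is emitted yet.
            buffer.append(item)
            continue
        # Buffer is full: emit a random slot, then refill it with the new item.
        idx = rng.randrange(len(buffer))
        yield buffer[idx]
        buffer[idx] = item
    # Input exhausted: drain the remaining elements in random order.
    while buffer:
        idx = rng.randrange(len(buffer))
        buffer[idx], buffer[-1] = buffer[-1], buffer[idx]
        yield buffer.pop()


# 10,000 elements, buffer of 1,000, as in the example above.
shuffled = list(shuffle_with_buffer(range(10_000), buffer_size=1_000, seed=0))
print(shuffled[:5])  # every one of these is an element from near the start
```

In this sketch the element at output position k can only come from the first k + buffer_size input elements, so a small buffer shuffles only locally. With buffer_size at least as large as the dataset, the whole input is loaded before anything is emitted, which gives a uniform shuffle.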

Nicolas Gervais
0

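According to the TensorFlow documentation, buffer_size sets how many elements are held in the buffer that each output element is drawn from. The first element is picked at random from the first buffer_size elements; the element that follows the buffer in the dataset then takes its place, and the next pick is made from the refilled buffer.

samples = 1000
buffer_size = 100

the first element is chosen at random from indices 0 to 99
random = 37
element 100 replaces element 37 in the buffer
the second element is chosen from indices 0 to 100, excluding 37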