I am using the following code to train my network:
classifier = tf.estimator.Estimator(
    model_fn=my_neural_network_model,
    model_dir=some_path_to_save_checkpoints,
    params={
        some_parameters
    }
)
classifier.train(input_fn=data_train_estimator, steps=step_num)
where data_train_estimator is defined as:
def data_train_estimator():
    dataset = tf.data.TextLineDataset(train_csv_file).map(_parse_csv_train)
    dataset = dataset.batch(100)
    dataset = dataset.shuffle(1000)
    dataset = dataset.repeat()
    iterator = dataset.make_one_shot_iterator()
    feature, label = iterator.get_next()
    return feature, label
How does dataset.shuffle(1000) actually work?
More specifically,
Let's say I have 20000 images, batch size = 100, shuffle buffer size = 1000, and I train the model for 5000 steps.
1. For every 10 steps, am I using 10 batches (of size 100), each independently drawn from the same 1000 images in the shuffle buffer?
2.1 Does the shuffle buffer work like a moving window?
2.2 Or, does it randomly pick 1000 out of the 20000 images (with or without replacement)?
3. Over the whole 5000 steps, how many different states does the shuffle buffer go through?
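To make my mental model concrete, here is how I currently imagine the buffer behaving, as a plain-Python sketch (this is my own guess at the mechanism, not TensorFlow's actual implementation; `shuffle_buffer` and the backfill logic are names and assumptions of mine):

```python
import random

def shuffle_buffer(items, buffer_size, seed=0):
    """Sketch of how I imagine shuffle(buffer_size) works: keep a buffer
    of up to `buffer_size` elements; each time an element is requested,
    pick one uniformly at random from the buffer and backfill its slot
    with the next element from the input stream."""
    rng = random.Random(seed)
    it = iter(items)
    buf = []
    # Fill the buffer up front, until it is full or the stream ends.
    for x in it:
        buf.append(x)
        if len(buf) == buffer_size:
            break
    while buf:
        i = rng.randrange(len(buf))
        yield buf[i]
        # Backfill from the stream; shrink the buffer once it runs dry.
        try:
            buf[i] = next(it)
        except StopIteration:
            buf[i] = buf[-1]
            buf.pop()
```

Under this model, the element yielded at step j can only come from the first j + buffer_size input elements, which is why I suspect the buffer behaves like a moving window over the data rather than a uniform sample of the whole dataset. Is that correct?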