Suppose I have a huge list of objects, each of which could be, for example, a list of NumPy arrays.
What's the best way to pass this dataset to TensorFlow?
I want to be able to randomly shuffle the data and form batches. Maybe it would be better to shuffle the dataset and form batches using standard Python (NumPy) procedures, and then use something like tf.data.Dataset.from_generator()?
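To make the idea concrete, here is a minimal sketch of the Python-side shuffling and batching I have in mind. The function name `shuffled_batches` and its parameters are hypothetical, chosen only for illustration, and it assumes the whole list can be indexed in memory.

```python
import random


def shuffled_batches(items, batch_size, seed=None):
    """Yield lists of up to `batch_size` objects from `items` in random order.

    `items` is any indexable sequence (e.g. a list whose elements are
    lists of NumPy arrays). Only the indices are shuffled, so the
    objects themselves are never copied.
    """
    rng = random.Random(seed)
    indices = list(range(len(items)))
    rng.shuffle(indices)
    for start in range(0, len(indices), batch_size):
        # The last batch may be smaller than batch_size.
        yield [items[i] for i in indices[start:start + batch_size]]
```

A generator like this could presumably be wrapped as `tf.data.Dataset.from_generator(lambda: shuffled_batches(data, 32), ...)`, with the output types or signature describing one batch. How to describe elements that are lists of differently shaped arrays is part of what I am unsure about.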
The straightforward approach of converting the full dataset to a tf.Tensor seems to be useless because of the size limit on the tf.GraphDef protocol buffer (according to the TensorFlow documentation).