
Let's say you have a folder containing your dataset (a lot of images). You want to feed these images to a deep neural network for training (in my case I'm using TensorFlow for now).

The first solution that comes to mind (a very inelegant, beginner solution) is to load all the images into an array. This is OK for a small dataset, but when the dataset is big and the pictures are big it is not viable, because you will not have enough memory.

The solution is to read the data in batches.

I'm trying to implement that. The dataset I'm interested in is the Caltech-UCSD Birds 200 dataset. It comes with a text file in which each line contains the path to an image, which makes things easier. My solution (which I'm trying to implement) is to define a class. The template is:

class Dataset : 
          attributes : 
                  images_paths
                  labels 
                  current_batch_index
                  batch_size
                  classes_names
          methods : 
                  get_next_batch() 
                  shuffle()
                  normalize()

As soon as I instantiate an object of this class, the paths to all images are stored in the attribute images_paths and the ground-truth labels are stored in labels (one-hot encoded). The method get_next_batch() uses current_batch_index to return an array in which we store the actual images loaded from their paths. The size of the array is batch_size, and the indices read from images_paths and labels run from current_batch_index to current_batch_index + batch_size. (I read images using scipy.misc.imread and resize them to a fixed shape (200x200) using scipy.misc.imresize.)

This way the object keeps only one batch in memory at a time, and I use it in the training loop to feed the network.
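
Here is a rough sketch of what I have in mind (using Pillow and NumPy instead of the deprecated scipy.misc helpers; the exact format of the path listing and the folder layout are assumptions on my part, and the file names are placeholders):

    import os
    import numpy as np
    from PIL import Image

    class Dataset:
        def __init__(self, root_dir, list_file, batch_size=32, image_size=(200, 200)):
            self.batch_size = batch_size
            self.image_size = image_size
            self.current_batch_index = 0
            self.images_paths = []
            raw_labels = []
            with open(list_file) as f:
                for line in f:
                    # assumed format: "<id> <class_folder>/<file>.jpg" (last token is the path)
                    rel_path = line.strip().split()[-1]
                    self.images_paths.append(os.path.join(root_dir, rel_path))
                    raw_labels.append(rel_path.split("/")[0])  # class name from the folder
            self.classes_names = sorted(set(raw_labels))
            index = {name: i for i, name in enumerate(self.classes_names)}
            # one-hot encode the ground-truth labels
            self.labels = np.eye(len(self.classes_names))[[index[l] for l in raw_labels]]

        def shuffle(self):
            perm = np.random.permutation(len(self.images_paths))
            self.images_paths = [self.images_paths[i] for i in perm]
            self.labels = self.labels[perm]
            self.current_batch_index = 0

        def normalize(self, images):
            return images / 255.0

        def get_next_batch(self):
            start = self.current_batch_index
            end = start + self.batch_size
            batch_paths = self.images_paths[start:end]
            batch_labels = self.labels[start:end]
            self.current_batch_index = end if end < len(self.images_paths) else 0
            # load only this batch of images into memory
            images = np.stack([
                np.asarray(Image.open(p).convert("RGB").resize(self.image_size),
                           dtype=np.float32)
                for p in batch_paths
            ])
            return self.normalize(images), batch_labels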

Questions: What do you think of this? How do you normally feed your images to the network? Are there tools for that? Are there tools to split your dataset?

F.Y.I.: I'm using Python and TensorFlow. It would also be interesting to know the answers to these questions for C++.

Thank you, and sorry for the long post.

Volta

1 Answer


TensorFlow allows reading the data from disk in batches as needed, and has methods for buffering data ahead of time to reduce latency (e.g. while batch 3 is running through the network, batch 4 is sitting in memory and batch 5 is being loaded into the memory where batch 2 used to be). Check out the tf.data library. The CIFAR-10 example does something like what you are asking, but CIFAR-10 is stored in an unusual format, so some adjustment is in order.
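
As a rough sketch (assuming a recent TensorFlow 2.x; the path list and labels below are placeholders you would build from the dataset's file listing), a tf.data pipeline over a list of image paths could look like this:

    import tensorflow as tf

    # Placeholder inputs: lists of image file paths and integer class labels.
    image_paths = ["images/001.Black_footed_Albatross/img_0001.jpg"]
    labels = [0]

    def load_image(path, label):
        # Read and decode the JPEG, resize to a fixed shape, scale to [0, 1].
        image = tf.io.read_file(path)
        image = tf.image.decode_jpeg(image, channels=3)
        image = tf.image.resize(image, [200, 200]) / 255.0
        return image, label

    dataset = (tf.data.Dataset.from_tensor_slices((image_paths, labels))
               .shuffle(buffer_size=1000)
               .map(load_image, num_parallel_calls=tf.data.AUTOTUNE)
               .batch(32)
               .prefetch(tf.data.AUTOTUNE))  # load the next batch while the current one trains

    # The training loop can then iterate directly: for images, labels in dataset: ...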

Anybody have a better example?

Him
  • To add exotic ingredients to the soup of "how to read data for TF": I suggest having a look at [bullet-proof LMDBs](https://monero.stackexchange.com/questions/702/why-did-monero-choose-lmdb-over-alternative-database-types/784) with all [their advantages](http://tensorpack.readthedocs.io/tutorial/input-source.html#python-reader-or-tf-reader) besides offering [high performance](https://stackoverflow.com/questions/35279756/what-is-special-about-internal-design-of-lmdb). A disadvantage of `tfrecord` is the missing feature of random reads (yes, you could fake it with sharding and local shuffling). – Patwie Aug 02 '18 at 11:01
  • Thank you, I'm reading the tf.data documentation. It appears that I have to create a tfrecord of my dataset. Is that too much work for this small dataset (around 12,000 images)? – Volta Aug 02 '18 at 20:20
  • @Volta, it's not *that* much work. I have insufficient information about your exact circumstances to make a judgment call, though. – Him Aug 03 '18 at 05:04