I am using tf.keras to build my network, and I am doing all of my augmentation at the tensor level, since my data is stored in TFRecord files. I then needed shearing and ZCA whitening for augmentation but couldn't find a proper implementation in TensorFlow. I also can't use ImageDataGenerator, which provides both operations I need, because, as I said, my data doesn't fit in memory and is in TFRecord format. So my whole augmentation pipeline has to be tensor-wise.
@fchollet here suggested a way to use ImageDataGenerator with a large dataset.
My first question is: if I use @fchollet's approach, which is basically running ImageDataGenerator on a sample X of the large dataset and then using train_on_batch to train the network, how can I feed my validation data to the network?
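For reference, here is my understanding of that approach as a minimal sketch; the helpers load_chunk and load_validation_chunk and the counts num_epochs / num_chunks are hypothetical placeholders, not part of @fchollet's code:

import numpy as np
from tensorflow.keras.preprocessing.image import ImageDataGenerator

datagen = ImageDataGenerator(shear_range=0.2, zca_whitening=True)

for epoch in range(num_epochs):
    for chunk_idx in range(num_chunks):
        x_chunk, y_chunk = load_chunk(chunk_idx)     # hypothetical: load one in-memory chunk
        datagen.fit(x_chunk)                         # fit() is required for ZCA statistics
        flow = datagen.flow(x_chunk, y_chunk, batch_size=batch_size)
        for _ in range(len(x_chunk) // batch_size):  # one pass over the chunk
            x_batch, y_batch = next(flow)
            model.train_on_batch(x_batch, y_batch)
    x_val, y_val = load_validation_chunk()           # hypothetical: one validation chunk
    val_metrics = model.test_on_batch(x_val, y_val)

My guess is that validation would have to be fed the same chunk-wise way through test_on_batch, but I am not sure that is the intended pattern.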
My second question is: is there any tensor-wise implementation of the shear and ZCA operations? Some people, like here, suggested using tf.contrib.image.transform, but I couldn't understand how. If someone has an idea of how to do it, I would appreciate it.
Update:
This is my attempt to construct the transformation matrix through skimage:
from skimage import transform as trans
import tensorflow as tf

def augment(image):
    # Build a shear transform with skimage, invert it, and flatten it into
    # the 8-element form that tf.contrib.image.transform expects.
    afine_tf = trans.AffineTransform(shear=0.2)
    transform = tf.contrib.image.matrices_to_flat_transforms(tf.linalg.inv(afine_tf.params))
    transform = tf.cast(transform, tf.float32)
    image = tf.contrib.image.transform(image, transform)  # image here is a tensor
    return image
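For the ZCA part, the only tensor-wise idea I have (an untested assumption, not a working solution) is to fit the whitening matrix offline on a small in-memory sample, since ZCA needs dataset-wide statistics, and then apply it inside the map function as a plain matrix multiplication. Note that the covariance matrix is (h*w*c) x (h*w*c), so this is only feasible for small images:

import numpy as np
import tensorflow as tf

def compute_zca_matrix(x_sample, eps=1e-5):
    # x_sample: NumPy array of shape (n, h, w, c) sampled from the data.
    flat = x_sample.reshape(x_sample.shape[0], -1).astype(np.float64)
    flat -= flat.mean(axis=0)
    sigma = np.cov(flat, rowvar=False)
    u, s, _ = np.linalg.svd(sigma)
    return (u @ np.diag(1.0 / np.sqrt(s + eps)) @ u.T).astype(np.float32)

zca_matrix = tf.constant(compute_zca_matrix(x_sample))  # x_sample: hypothetical in-memory sample

def zca_whiten(image):
    # image: float32 tensor of shape (h, w, c), assumed already mean-centered.
    shape = tf.shape(image)
    flat = tf.reshape(image, [1, -1])
    return tf.reshape(tf.matmul(flat, zca_matrix), shape)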
dataset_train = tf.data.TFRecordDataset(training_files, num_parallel_reads=calls)
dataset_train = dataset_train.apply(tf.contrib.data.shuffle_and_repeat(buffer_size=1000 + 4 * batch_size))
dataset_train = dataset_train.map(decode_train, num_parallel_calls=calls)
dataset_train = dataset_train.map(augment, num_parallel_calls=calls)
dataset_train = dataset_train.batch(batch_size)
dataset_train = dataset_train.prefetch(tf.contrib.data.AUTOTUNE)
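Since tf.keras accepts tf.data datasets directly in fit() (TF >= 1.9), my current thinking for the validation question is to build a second, non-augmented pipeline and pass both to fit(); validation_files, decode_val, and the size variables below are hypothetical:

dataset_val = tf.data.TFRecordDataset(validation_files, num_parallel_reads=calls)
dataset_val = dataset_val.map(decode_val, num_parallel_calls=calls)  # no augmentation
dataset_val = dataset_val.batch(batch_size)

model.fit(dataset_train,
          epochs=num_epochs,
          steps_per_epoch=train_size // batch_size,
          validation_data=dataset_val,
          validation_steps=val_size // batch_size)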