I am trying to do different kinds of (image) data augmentation for training my neural network.
I know that tf.image offers some augmentation functions, but they are too simple. For example, I can only rotate an image in 90-degree steps, not by an arbitrary angle.
I also know that tf.keras.preprocessing.image offers random rotation, random shear, random shift and random zoom. However, these methods can only be applied to NumPy arrays, not to tensors.
I know I can read the images first, use functions from tf.keras.preprocessing.image to do the augmentation, and then convert these augmented numpy arrays to tensors.
However, I wonder whether there is a way to implement the augmentations directly on tensors, so that I don't need to bother with the "image file -> tensor -> numpy array -> tensor" procedure.
Update for those who want to know how to apply your transform:
For detailed source code, you may want to check tf.contrib.image.transform and tf.contrib.image.matrices_to_flat_transforms.
Here is my code:
import tensorflow as tf  # written for TensorFlow 1.x, where tf.contrib exists

def transformImg(imgIn, forward_transform):
    # forward_transform must be a float matrix,
    # e.g. [[2.0,0,0],[0,1.0,0],[0,0,1]] will work
    # but [[2,0,0],[0,1,0],[0,0,1]] will not.
    # tf.contrib.image.transform expects the output-to-input mapping, hence the inverse.
    t = tf.contrib.image.matrices_to_flat_transforms(tf.linalg.inv(forward_transform))
    imgOut = tf.contrib.image.transform(imgIn, t, interpolation="BILINEAR", name=None)
    return imgOut
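Basically, the code above is doing

$$\begin{pmatrix} x' \\ y' \\ 1 \end{pmatrix} = \text{forward\_transform} \cdot \begin{pmatrix} x \\ y \\ 1 \end{pmatrix}$$

for every point (x, y) in imgIn, where (x', y') is the position of that point in imgOut (homogeneous coordinates).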
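The per-point mapping that forward_transform describes can be sketched in plain Python, without TensorFlow. This is only an illustration of the arithmetic; the helper name apply_forward is hypothetical and not part of any library.

```python
def apply_forward(matrix, x, y):
    """Map the point (x, y) through a 3x3 matrix in homogeneous coordinates."""
    xh = matrix[0][0] * x + matrix[0][1] * y + matrix[0][2]
    yh = matrix[1][0] * x + matrix[1][1] * y + matrix[1][2]
    w = matrix[2][0] * x + matrix[2][1] * y + matrix[2][2]
    # Divide by w so that the third homogeneous coordinate becomes 1 again.
    return xh / w, yh / w

# Scaling x by 2 (the float matrix from the comment in transformImg):
# the point (3, 5) moves to (6, 5).
scale_x = [[2.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
print(apply_forward(scale_x, 3.0, 5.0))
```

For affine matrices (last row [0, 0, 1]) the divisor w is always 1, so the division only matters for general projective transforms.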
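A shear transform parallel to the x axis, for example, is

$$\begin{pmatrix} x' \\ y' \\ 1 \end{pmatrix} = \begin{pmatrix} 1 & \lambda & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} x \\ y \\ 1 \end{pmatrix}$$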
Therefore, we can implement a shear transform like this (using transformImg() defined above):
def shear_transform_example(filename, shear_lambda):
    image_string = tf.read_file(filename)
    image_decoded = tf.image.decode_jpeg(image_string, channels=3)
    img = transformImg(image_decoded, [[1.0, shear_lambda, 0], [0, 1.0, 0], [0, 0, 1.0]])
    return img

img = shear_transform_example("white_square.jpg", 0.1)
(Please note that img is a tensor; code to convert tensors to image files is not included.)
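The same pattern should extend to rotation by an arbitrary angle, which was the original motivation. Below is a hedged sketch in plain Python that builds a 3x3 forward matrix rotating about a chosen center (cx, cy). The helper names matmul3 and rotation_about are hypothetical. The result is a nested list of floats, which is the kind of matrix transformImg() takes as forward_transform.

```python
import math

def matmul3(a, b):
    """Multiply two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def rotation_about(angle_degrees, cx, cy):
    """Forward matrix: rotate by angle_degrees around the point (cx, cy)."""
    t = math.radians(angle_degrees)
    c, s = math.cos(t), math.sin(t)
    to_origin = [[1.0, 0.0, -cx], [0.0, 1.0, -cy], [0.0, 0.0, 1.0]]
    rotate = [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]
    back = [[1.0, 0.0, cx], [0.0, 1.0, cy], [0.0, 0.0, 1.0]]
    # Applied right to left: shift the center to the origin, rotate, shift back.
    return matmul3(back, matmul3(rotate, to_origin))

# Rotate by 30 degrees around the center of a 100x100 image (assumed size).
forward = rotation_about(30.0, 50.0, 50.0)
```

Because the y axis points down in image coordinates, a positive angle with this matrix turns the image clockwise on screen. Rotating about the center rather than the top-left corner keeps most of the image inside the frame.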
P.S.
The code above works on TensorFlow 1.10.1 and might not work on future versions.
To be honest, I really don't know why they designed tf.contrib.image.transform so that we have to call another function (tf.linalg.inv) to get what we want. I really hope they change tf.contrib.image.transform to work in a more intuitive way.
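One likely reason for the design: the documentation of tf.contrib.image.transform describes each transform [a0, a1, a2, b0, b1, b2, c0, c1] as mapping an output point to an input point, which is a common way to resample images because every output pixel gets exactly one lookup. The sketch below mimics that convention in plain Python. The helpers to_flat and warp_nearest are hypothetical, not TensorFlow functions, and the sketch uses nearest-neighbor lookup instead of bilinear interpolation.

```python
def to_flat(m):
    """Flatten a 3x3 matrix to 8 values, normalizing so the last entry is 1."""
    k = m[2][2]
    return [v / k for row in m for v in row][:8]

def warp_nearest(img, flat, fill=0):
    """Inverse warping: for each output pixel, look up the source pixel."""
    a0, a1, a2, b0, b1, b2, c0, c1 = flat
    h, w = len(img), len(img[0])
    out = [[fill] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            k = c0 * x + c1 * y + 1.0
            sx = int(round((a0 * x + a1 * y + a2) / k))
            sy = int(round((b0 * x + b1 * y + b2) / k))
            # Pixels whose source falls outside the image keep the fill value.
            if 0 <= sx < w and 0 <= sy < h:
                out[y][x] = img[sy][sx]
    return out

# The forward transform "shift right by 1" has the inverse "shift left by 1",
# and it is the inverse that the warp needs.
inverse_shift = [[1.0, 0.0, -1.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
small_img = [[1, 2, 3],
             [4, 5, 6]]
shifted = warp_nearest(small_img, to_flat(inverse_shift))
print(shifted)  # [[0, 1, 2], [0, 4, 5]]
```

Passing the inverse matrix moves the content to the right, which is what the forward transform describes. This is the role tf.linalg.inv plays in transformImg().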