
The Keras documentation for the ImageDataGenerator class says:

width_shift_range: Float, 1-D array-like or int
- float: fraction of total width, if < 1, or pixels if >= 1.
- 1-D array-like: random elements from the array.
- int: integer number of pixels from interval (-width_shift_range, +width_shift_range)
- With width_shift_range=2 possible values are integers [-1, 0, +1], same as with width_shift_range=[-1, 0, +1], while with width_shift_range=1.0 possible values are floats in the interval [-1.0, +1.0).

height_shift_range: Float, 1-D array-like or int
- float: fraction of total height, if < 1, or pixels if >= 1.
- 1-D array-like: random elements from the array.
- int: integer number of pixels from interval (-height_shift_range, +height_shift_range)
- With height_shift_range=2 possible values are integers [-1, 0, +1], same as with height_shift_range=[-1, 0, +1], while with height_shift_range=1.0 possible values are floats in the interval [-1.0, +1.0).

I'm new to Keras and machine learning, and I have just started learning them.

I am struggling to understand the documentation and the use of two arguments of the Keras ImageDataGenerator class, width_shift_range and height_shift_range. I have searched a lot but couldn't find any good documentation other than the official one. What exactly do these two arguments do? When should they be used?

This question may seem out of place here, but since I could not find a discussion of it anywhere else on the internet, I think it would be nice to have one here.

If anyone can help me understand these, I would be grateful. Thank you very much.

Arafat Hasan

1 Answer


These two arguments belong to the ImageDataGenerator class, which preprocesses images before they are fed into the network. A small dataset is usually not enough to make a model robust, and that is where data augmentation comes in handy: these arguments are used to generate randomly shifted variants of the training images.

width_shift_range: It shifts the image left or right (horizontal shifts). If the value is a float below 1, it is taken as a fraction of the total width. Suppose the image width is 100px: with width_shift_range = 0.2 the range is -20% to +20%, i.e. -20px to +20px, and each image is shifted by a random amount within this range. A randomly selected positive value shifts the image to the right and a negative value shifts it to the left. We can also specify the range in pixels: the integer width_shift_range = 20 covers roughly the same range, except that only whole-pixel shifts from the interval (-20, +20) are drawn. The key rule is that an integer, or a float >= 1, is counted in pixels, while a float < 1 is counted as a fraction of the total width. Note that 1.0 falls on the pixel side: per the documentation, width_shift_range = 1.0 gives float shifts in [-1.0, +1.0) pixels, not -100% to +100% of the width. The images below are for width_shift_range = 1.0.

[Image: samples shifted with width_shift_range = 1.0]
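
As a rough illustration of how the value is interpreted, here is a minimal sketch that follows the rules in the documentation quoted in the question. It is not Keras's actual implementation; the function name `sample_shift_pixels` and its parameters are made up for this example, and only the standard library is used.

```python
import random


def sample_shift_pixels(shift_range, size, rng=random):
    """Return one random shift, in pixels, for an axis that is `size` pixels long.

    Hypothetical helper: it mirrors the documented rules only,
    not the library's internal code.
    """
    if isinstance(shift_range, (list, tuple)):
        # 1-D array-like: pick one of the listed values at random.
        return rng.choice(list(shift_range))
    if isinstance(shift_range, int):
        # int: whole pixels from the open interval (-r, +r),
        # e.g. r=2 gives one of -1, 0, +1.
        if shift_range < 1:
            return 0
        return rng.randint(-(shift_range - 1), shift_range - 1)
    # float: draw uniformly from the range ...
    value = rng.uniform(-shift_range, shift_range)
    if shift_range < 1:
        # ... and treat it as a fraction of the axis length when below 1.
        return value * size
    # A float >= 1 is already a pixel count.
    return value


if __name__ == "__main__":
    rng = random.Random(0)
    print(sample_shift_pixels(0.2, 100, rng))  # float pixels, between -20 and +20
    print(sample_shift_pixels(2, 100, rng))    # one of -1, 0, +1
    print(sample_shift_pixels(1.0, 100, rng))  # float pixels, between -1 and +1
```

The same logic applies to height_shift_range, with `size` being the image height instead of the width.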

height_shift_range: It works the same way as width_shift_range but shifts the image vertically (up or down). The images below are for height_shift_range=0.2, fill_mode="constant".

[Image: samples shifted with height_shift_range=0.2, fill_mode="constant"]
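
To make the effect of a shift concrete, here is a small pure-Python sketch that moves a tiny "image" (a list of rows) by a whole number of pixels and fills the vacated pixels with a constant. The helper `shift_image` is hypothetical, and the sign convention (positive `dx` moves content right, positive `dy` moves it down) is an assumption of this sketch, not something taken from the Keras source.

```python
def shift_image(image, dx=0, dy=0, cval=0):
    """Shift a 2-D list of pixels by (dx, dy); vacated pixels become `cval`."""
    height = len(image)
    width = len(image[0])
    shifted = []
    for row in range(height):
        new_row = []
        for col in range(width):
            # Look up where this output pixel comes from in the input.
            src_row = row - dy
            src_col = col - dx
            if 0 <= src_row < height and 0 <= src_col < width:
                new_row.append(image[src_row][src_col])
            else:
                # The source lies outside the input: constant fill.
                new_row.append(cval)
        shifted.append(new_row)
    return shifted


if __name__ == "__main__":
    img = [[1, 2, 3],
           [4, 5, 6],
           [7, 8, 9]]
    print(shift_image(img, dx=1))   # [[0, 1, 2], [0, 4, 5], [0, 7, 8]]
    print(shift_image(img, dy=-1))  # [[4, 5, 6], [7, 8, 9], [0, 0, 0]]
```

The pixels that become `cval` here are exactly the ones whose value is decided by fill_mode.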

fill_mode: It sets the rule for filling the pixels that a shift leaves empty, i.e. points that fall outside the boundaries of the input.

## fill_mode: One of {"constant", "nearest", "reflect" or "wrap"}. 
## Points outside the boundaries of the input are filled according to the given mode:
## "constant": kkkkkkkk|abcd|kkkkkkkk (cval=k)
## "nearest":  aaaaaaaa|abcd|dddddddd
## "reflect":  abcddcba|abcd|dcbaabcd
## "wrap":  abcdabcd|abcd|abcdabcd
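
The four patterns above can be reproduced with a short index-mapping sketch in one dimension. The helpers `fill_index` and `extend` are hypothetical names for this example; they only illustrate the patterns shown in the documentation and are not the code Keras runs.

```python
def fill_index(i, n, mode):
    """Map index i onto a valid index in [0, n); None means 'use cval'."""
    if 0 <= i < n:
        return i
    if mode == "constant":
        return None
    if mode == "nearest":
        # Clamp to the closest edge.
        return min(max(i, 0), n - 1)
    if mode == "reflect":
        # Mirror about the edges, repeating the edge value.
        j = i % (2 * n)
        return j if j < n else 2 * n - 1 - j
    if mode == "wrap":
        # Tile the input periodically.
        return i % n
    raise ValueError(f"unknown fill mode: {mode!r}")


def extend(values, pad, mode, cval="k"):
    """Return `values` extended by `pad` items on each side using `mode`."""
    n = len(values)
    out = []
    for i in range(-pad, n + pad):
        j = fill_index(i, n, mode)
        out.append(cval if j is None else values[j])
    return "".join(out)


if __name__ == "__main__":
    for mode in ("constant", "nearest", "reflect", "wrap"):
        s = extend("abcd", 8, mode)
        # Print in the same left|input|right layout as the documentation.
        print(f"{mode:>8}: {s[:8]}|{s[8:12]}|{s[12:]}")
```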

For more details, you can check this blog.

Sayed Sohan