I have a small, imbalanced dataset of 4116 aerial images, each 224x224x3 (RGB). Since the dataset is not big enough, I am very likely to run into overfitting. Image preprocessing and data augmentation help to tackle this problem, as explained below.
"Overfitting is caused by having too few samples to learn from, rendering you unable to train a model that can generalize to new data. Given infinite data, your model would be exposed to every possible aspect of the data distribution at hand: you would never overfit. Data augmentation takes the approach of generating more training data from existing training samples, by augmenting the samples via a number of random transformations that yield believable-looking images."
Deep Learning with Python by François Chollet, pages 138–139, section 5.2.5, "Using data augmentation".
I've read Medium - Image Data Preprocessing for Neural Networks and examined Stanford's CS230 - Data Preprocessing and CS231 - Data Preprocessing courses. The point is highlighted once more in this SO question, and I understand that there is no one-size-fits-all solution. Here is what prompted me to ask this question:
"No translation augmentation was used since we want to achieve high spatial resolution."
I know that I will use Keras - ImageDataGenerator Class, but I don't know which techniques and which parameter values to use for the task of semantic segmentation of small objects. Could someone enlighten me? Thanks in advance. :)

Here is what I have so far:
from keras.preprocessing.image import ImageDataGenerator

datagen = ImageDataGenerator(
    rotation_range=20,        # degree range for random rotations (value in 0-180)
    width_shift_range=0.2,    # fraction of total width within which to randomly translate pictures horizontally
    height_shift_range=0.2,   # fraction of total height within which to randomly translate pictures vertically
    shear_range=0.2,          # intensity of randomly applied shearing transformations
    zoom_range=0.2,           # range for randomly zooming inside pictures
    horizontal_flip=True,     # randomly flip half of the images horizontally
    fill_mode='nearest',      # strategy for filling in newly created pixels, which can appear after a rotation or a width/height shift
    featurewise_center=True,              # set the input mean to 0 over the dataset
    featurewise_std_normalization=True)   # divide inputs by the dataset's standard deviation

datagen.fit(X_train)  # computes the statistics required by the featurewise options above
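From what I can tell from the Keras documentation, for segmentation the same random transforms have to be applied to the images and their masks, which is done by giving two generators an identical seed. Below is a minimal sketch of that pattern under my own assumptions: X_train and Y_train are aligned NumPy arrays (masks with a channel axis), and batch_size=32 and seed=1 are placeholder values.

from keras.preprocessing.image import ImageDataGenerator

# Shared augmentation arguments; featurewise normalization is applied
# only to the images, never to the masks.
data_gen_args = dict(
    rotation_range=20,
    width_shift_range=0.2,
    height_shift_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True,
    fill_mode='nearest')  # fills pixels exposed by rotations/shifts

image_datagen = ImageDataGenerator(featurewise_center=True,
                                   featurewise_std_normalization=True,
                                   **data_gen_args)
mask_datagen = ImageDataGenerator(**data_gen_args)

image_datagen.fit(X_train, seed=1)  # statistics for featurewise normalization

seed = 1  # identical seed => identical random transforms on images and masks
image_generator = image_datagen.flow(X_train, batch_size=32, seed=seed)
mask_generator = mask_datagen.flow(Y_train, batch_size=32, seed=seed)

train_generator = zip(image_generator, mask_generator)  # yields (image_batch, mask_batch) pairs

If I understand correctly, the featurewise statistics computed by fit would also have to be applied to validation and test images (e.g. through the same generator's standardize method), otherwise training and test inputs end up on different scales.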