I'm trying to apply data augmentation to a dataset. I use the following code:
train_generator = keras.utils.image_dataset_from_directory(
    directory=train_dir,
    subset="training",
    image_size=(50, 50),
    batch_size=32,
    validation_split=0.3,
    seed=1337,
    labels="inferred",
    label_mode="binary",
)
validation_generator = keras.utils.image_dataset_from_directory(
    subset="validation",
    directory=validation_dir,
    image_size=(50, 50),
    batch_size=40,
    seed=1337,
    validation_split=0.3,
    labels="inferred",
    label_mode="binary",
)
data_augmentation = keras.Sequential([
    keras.layers.RandomFlip("horizontal"),
    keras.layers.RandomRotation(0.1),
    keras.layers.RandomZoom(0.1),
])
train_dataset = train_generator.map(lambda x, y: (data_augmentation(x, training=True), y))
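As a sanity check, I compared the number of batches before and after the map (a quick sketch, assuming TensorFlow is imported as tf):

import tensorflow as tf

# cardinality() reports how many batches one pass over the dataset yields;
# if both prints show the same value, map() is not adding any batches.
print(tf.data.experimental.cardinality(train_generator).numpy())
print(tf.data.experimental.cardinality(train_dataset).numpy())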
But when I try to run the training process using this method, I get an "insufficient data" warning:
6/100 [>.............................] - ETA: 21s - loss: 0.7602 - accuracy: 0.5200WARNING:tensorflow:Your input ran out of data; interrupting training. Make sure that your dataset or generator can generate at least `steps_per_epoch * epochs` batches (in this case, 2000 batches). You may need to use the repeat() function when building your dataset.
WARNING:tensorflow:Your input ran out of data; interrupting training. Make sure that your dataset or generator can generate at least `steps_per_epoch * epochs` batches (in this case, 10 batches). You may need to use the repeat() function when building your dataset.
Yes, the original dataset is insufficient on its own, but I expected the data augmentation to provide more than enough data for training. Does anyone know what's going on?
EDIT:
fit call:
history = model.fit(
    train_dataset,
    epochs=20,
    steps_per_epoch=100,
    validation_data=validation_generator,
    validation_steps=10,
    callbacks=callbacks_list,
)
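If I read the warning correctly, fit needs steps_per_epoch * epochs = 100 * 20 = 2000 training batches (plus validation_steps = 10 validation batches per epoch), while each dataset is exhausted after a single pass over the files. The repeat() workaround the warning itself suggests would presumably look like this (untested sketch):

# repeat() with no argument loops over the data indefinitely, so fit()
# can draw as many batches as steps_per_epoch * epochs requires.
history = model.fit(
    train_dataset.repeat(),
    epochs=20,
    steps_per_epoch=100,
    validation_data=validation_generator.repeat(),
    validation_steps=10,
    callbacks=callbacks_list,
)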
This is the version I have using ImageDataGenerator:
train_datagen = keras.preprocessing.image.ImageDataGenerator(
    rescale=1 / 255, rotation_range=40, width_shift_range=0.2,
    height_shift_range=0.2, shear_range=0.2, zoom_range=0.2,
    horizontal_flip=True)
train_generator = train_datagen.flow_from_directory(
    directory=train_dir, target_size=(50, 50), batch_size=32, class_mode="binary")
val_datagen = keras.preprocessing.image.ImageDataGenerator(rescale=1 / 255)
validation_generator = val_datagen.flow_from_directory(
    directory=validation_dir, target_size=(50, 50), batch_size=40, class_mode="binary")
This specific code (with the same number of epochs, steps_per_epoch, and batch size) was taken from the book Deep Learning with Python by François Chollet; it's an example of a data augmentation setup on page 141. As you may have guessed, it produces the same results as the other method shown above.