Passing data from custom data generator to model.fit()

Question

I am doing end-to-end mapping. As I have to pass two images (input and output), I have created a custom generator. My generator gets two copies of the same image at different resolutions. Right now I can only pass 5 images to training, but I want to pass the whole generator so that the model is trained on all my data. As I am new to generators and yield, I don't know the correct way to pass the whole generator.

import os
import numpy as np
import cv2
class image_gen():
    def __init__(self, idir, odir, batch_size, shuffle=True):
        self.batch_index = 0
        self.idir = idir  # directory containing input images
        self.odir = odir  # directory containing output (target) images
        self.batch_size = batch_size  # number of samples in a batch
        self.shuffle = shuffle  # set to True to shuffle images, False for no shuffle
        self.label_list = []  # initialize list to hold sequential list of total labels generated
        self.image_list = []  # initialize list to hold sequential list of total image filenames generated
        # sorted so that index m refers to the same file name in both lists
        self.i_list = sorted(os.listdir(self.idir))  # list of input images in directory
        self.o_list = sorted(os.listdir(self.odir))  # list of output images in directory

    def get_images(self):  # yields batches of (input images, output images)
        while True:
            input_image_batch = []  # initialize list to hold a batch of input images
            output_image_batch = []  # initialize list to hold a batch of target images
            sample_count = len(self.i_list)  # determine total number of images available
            for i in range(self.batch_index * self.batch_size, (self.batch_index + 1) * self.batch_size):  # iterate for a batch
                j = i % sample_count  # cycle j value over range of available images
                if self.shuffle:  # if shuffle, pick a random index between 0 and sample_count-1 for the input/output pair
                    m = np.random.randint(low=0, high=sample_count)  # high is exclusive
                else:
                    m = j  # no shuffle
                # define the paths to the m th input and output images
                path_to_in_img = os.path.join(self.idir, self.i_list[m])
                path_to_out_img = os.path.join(self.odir, self.o_list[m])
                input_image = cv2.imread(path_to_in_img)
                input_image = cv2.resize(input_image, (3200, 3200))
                output_image = cv2.imread(path_to_out_img)
                output_image = cv2.resize(output_image, (3200, 3200))
                input_image_batch.append(input_image)
                output_image_batch.append(output_image)
            # convert the finished batch to arrays and scale pixel values to [0, 1]
            input_image_array = np.array(input_image_batch) / 255.0
            output_image_array = np.array(output_image_batch) / 255.0
            self.batch_index = self.batch_index + 1
            yield (input_image_array, output_image_array)
            if self.batch_index * self.batch_size > sample_count:
                break

This is how I get the images:

batch_size=5
idir=r'D:\train'
odir=r'D:\Train\train'
shuffle=True
gen=image_gen(idir,odir,batch_size,shuffle=True) # instantiate an instance of the class
input_images,output_images = next(gen.get_images())

This is how I train. This way I only train on 5 images and not the whole dataset:

model.fit(input_images,output_images,validation_data = (valin_images,valout_images),batch_size= 5,epochs = 100)

When I try to pass the whole dataset:

model.fit(gen(),validation_data = (valin_images,valout_images),batch_size= 5,epochs = 1)

I get an error: "image_gen" object is not callable. How should I pass the generator to model.fit()?

When I pass that, I get a "Failed to find data adapter that can handle input: , " error – user123, Sep 24 '21 at 12:14
I recommend that you either use a simple def my_generator() function or subclass Sequence(). – Timbus Calin, Sep 24 '21 at 12:28
In the first case, I believe your solution worked because you explicitly fetched the data from the generator and passed the resulting arrays to the model. In the second case, it tells you that the object is not callable, since what you are passing is not a generator but an instance of a class that contains a generator as a method. Either subclass Sequence() or remove the class and use def my_generator() as a single function, not a class. – Timbus Calin, Sep 24 '21 at 12:30

Timbus Calin · Answer 1 · 2021-09-24T13:06:00.700

3

You get this error because gen() tries to call an image_gen instance as if it were a function, whereas gen is an object of a class (one that does not define __call__).

In the first snippet, you called the class's get_images() method, which is indeed a generator, and next() pulled a single batch of NumPy arrays from it that could be fed to the model. That is why only 5 images were used. The second snippet fails because of the error described in the first paragraph.
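The difference can be reproduced without Keras or any images. The sketch below uses a hypothetical stand-in class, ImageGen, with a generator method get_batches(); neither name comes from the question, and the "data" is just a list of integers.

```python
class ImageGen:
    """Hypothetical stand-in for the asker's class (no image loading)."""

    def __init__(self, items, batch_size):
        self.items = items
        self.batch_size = batch_size

    def get_batches(self):
        # A generator method: calling it returns a generator object.
        for start in range(0, len(self.items), self.batch_size):
            yield self.items[start:start + self.batch_size]


gen = ImageGen(list(range(7)), batch_size=3)

try:
    gen()  # the instance defines no __call__, so this raises TypeError
except TypeError as exc:
    error_message = str(exc)  # "'ImageGen' object is not callable"

first_batch = next(gen.get_batches())   # only one batch, like the first snippet
all_batches = list(gen.get_batches())   # iterating the generator gives every batch
```

So the thing to hand over is the generator returned by the method (here gen.get_batches()), not the result of calling the instance.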

Two possible solutions for your problem would be the following:

Use a Keras Sequence() generator.
Use a function as a generator (def my_generator(...)).

I personally recommend the first solution, as a Sequence ensures that the network trains on each sample only once per epoch, a property that simple function generators do not guarantee.

Solution for Keras Sequence():

You need to subclass the Sequence class and override its __len__() and __getitem__() methods. A complete example from the official TensorFlow documentation is:

from skimage.io import imread
from skimage.transform import resize
from tensorflow.keras.utils import Sequence
import numpy as np
import math

# Here, `x_set` is a list of paths to the images
# and `y_set` are the associated classes.

class CIFAR10Sequence(Sequence):

    def __init__(self, x_set, y_set, batch_size):
        self.x, self.y = x_set, y_set
        self.batch_size = batch_size

    def __len__(self):
        # number of batches per epoch
        return math.ceil(len(self.x) / self.batch_size)

    def __getitem__(self, idx):
        # slice out the paths and labels that belong to batch number idx
        batch_x = self.x[idx * self.batch_size:(idx + 1) * self.batch_size]
        batch_y = self.y[idx * self.batch_size:(idx + 1) * self.batch_size]

        return np.array([
            resize(imread(file_name), (200, 200))
               for file_name in batch_x]), np.array(batch_y)

You can use the above code as a starting point for your solution. Incidentally, your network is unlikely to train with such large image dimensions (3200x3200), so you could also try lowering them.
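For the two-folder case in the question, the Sequence approach might be adapted roughly as follows. This is only a sketch under assumptions: PairedImageSequence and load_batch are hypothetical names, load_batch stands for any function that turns a list of paths into a batch array (for example one wrapping cv2.imread and cv2.resize as in the question), and the try/except fallback exists only so the sketch can be run without TensorFlow installed. Input and target paths are indexed through one shared, shuffled order, so each input always stays with its own target.

```python
import math
import random

try:
    from tensorflow.keras.utils import Sequence
except ImportError:
    Sequence = object  # fallback so the sketch also runs without TensorFlow


class PairedImageSequence(Sequence):
    """Serves (input_batch, target_batch) pairs that always stay aligned."""

    def __init__(self, input_paths, target_paths, batch_size, load_batch,
                 reshuffle=True):
        super().__init__()
        if len(input_paths) != len(target_paths):
            raise ValueError("input and target lists must have the same length")
        self.input_paths = list(input_paths)
        self.target_paths = list(target_paths)
        self.batch_size = batch_size
        self.load_batch = load_batch  # hypothetical: list of paths -> batch array
        self.reshuffle = reshuffle
        # One index order shared by both lists keeps the pairs together.
        self.order = list(range(len(self.input_paths)))
        self.on_epoch_end()

    def __len__(self):
        # Number of batches per epoch.
        return math.ceil(len(self.order) / self.batch_size)

    def __getitem__(self, idx):
        chosen = self.order[idx * self.batch_size:(idx + 1) * self.batch_size]
        x = self.load_batch([self.input_paths[i] for i in chosen])
        y = self.load_batch([self.target_paths[i] for i in chosen])
        return x, y

    def on_epoch_end(self):
        # Reshuffle the shared order, never the two lists separately.
        if self.reshuffle:
            random.shuffle(self.order)
```

Since the file names are the same in both folders, the two path lists could be built from one sorted os.listdir() result joined with each directory. An instance would then be passed directly, as in model.fit(train_seq, epochs=...), without a batch_size argument, because the Sequence already defines the batches.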

A solution with a simple generator function could be:

def my_generator(path_to_dataset, other_argument):
    ...
    while True:  # keep yielding so the data does not run out after the first epoch
        ...
        yield image_1, image_2  # one batch of inputs and the matching batch of targets

train_generator = my_generator(path_to_train, argument_1)
val_generator = my_generator(path_to_val, argument_2)
model.fit(train_generator,
          steps_per_epoch=len(training_samples) // BATCH_SIZE,
          epochs=10, validation_data=val_generator,
          validation_steps=len(validation_samples) // BATCH_SIZE)
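A function generator used this way is expected to keep producing batches for as long as training runs. If it stops after one pass, Keras reports that the input ran out of data once steps_per_epoch times epochs batches have been requested. A minimal sketch of a generator that never exhausts is below; paired_batch_generator and load_batch are hypothetical names, and file names with list as a dummy loader stand in for real images.

```python
import itertools
import math
import random


def paired_batch_generator(input_paths, target_paths, batch_size, load_batch,
                           shuffle=True):
    """Yield (input_batch, target_batch) forever, one full pass at a time."""
    order = list(range(len(input_paths)))
    while True:  # never exhausts; steps_per_epoch decides where an epoch ends
        if shuffle:
            random.shuffle(order)  # one shared order keeps the pairs aligned
        for start in range(0, len(order), batch_size):
            chosen = order[start:start + batch_size]
            x = load_batch([input_paths[i] for i in chosen])
            y = load_batch([target_paths[i] for i in chosen])
            yield x, y


# Stand-in data: file names only, with `list` as a dummy loader.
inputs = [f"in/{i}.png" for i in range(7)]
targets = [f"out/{i}.png" for i in range(7)]
gen = paired_batch_generator(inputs, targets, batch_size=3, load_batch=list)

steps_per_epoch = math.ceil(len(inputs) / 3)  # 3 steps cover all 7 samples
two_epochs = list(itertools.islice(gen, 2 * steps_per_epoch))
```

Taking two epochs' worth of batches with itertools.islice mimics what model.fit() does with steps_per_epoch: it simply asks for that many batches per epoch and relies on the generator to keep going.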

edited Sep 24 '21 at 13:06

answered Sep 24 '21 at 12:35

Timbus Calin


I understand, but my problem is that I need to return two images, and they have to be the same image. I have the data in two folders and the image names are the same, but when the shuffle happens, will the Keras Sequence API shuffle the same data for both folders, or will one folder give one image and the other something else? – user123 Sep 24 '21 at 12:38
Sorry, but I don't understand. You could try to split the data into two folders or perform some logic in the __getitem__() to ensure the correct data is fetched. – Timbus Calin Sep 24 '21 at 12:43
What you are saying now has no relationship with the error that you are facing. – Timbus Calin Sep 24 '21 at 12:43
Why don't you use a generator function directly (remove the class) if you do not want to use Sequence()? I provided that alternative in my answer. – Timbus Calin Sep 24 '21 at 12:45
OK, I will try removing the class and creating a generator function. – user123 Sep 24 '21 at 12:50
@user123 did you manage to solve your problem? – Timbus Calin Sep 28 '21 at 14:44
I created a function as you advised, but have not tested it with the data. I will let you know once I test it. – user123 Sep 29 '21 at 18:14
I did create a function, but it gives me a "ran out of data" error. Here is the link to my question about that problem: https://stackoverflow.com/questions/69440110/tensorflowyour-input-ran-out-of-data-when-using-custom-generator – user123 Oct 04 '21 at 17:31
Your problem is solved. What you are posting is another problem. Since it is solved, please accept my answer. Apart from that, here is the solution to your problem: https://stackoverflow.com/questions/60509425/how-to-use-repeat-function-when-building-data-in-keras/60509810#60509810 – Timbus Calin Oct 04 '21 at 18:17
When I call .repeat(), it says the generator does not have an attribute repeat. – user123 Oct 04 '21 at 18:19
