How to use sklearn.metrics.classification_report with a generator?

Question

I would like to use sklearn.metrics.classification_report, but I cannot call it directly because my testing data is provided by a Python generator. The generator is given below.

Let me give you some context. I am doing speech recognition, and I am using generators to deal with memory issues. Basically, the generator loads batches, each of which contains numpy arrays corresponding to spectrograms. Each batch has a unique id, so it is simple to retrieve both the input batch and the matching output batch.

I am using tensorflow.keras to build my neural network. So far, I can train my model with Model.fit (fit_generator is deprecated in recent versions) and estimate the generalization error with Model.evaluate_generator, using common numerical metrics (a macro F1 score built with keras.backend, accuracy, ...). However, I have no idea how to implement a classification report with keras.backend and include it in my list of metrics when compiling the model.

Does anyone have an idea? Any help would be appreciated! Feel free to ask for more details and to comment on my code. I would love to read your tips, methodology and so on.

Many thanks!

import glob
import random

import numpy as np


class Pipeline:

    # some code ...

    def generator(self, set_name):
        # set_name: 'training', 'validation', or 'testing'
        input_paths = glob.glob('{}/{}/batch_input__*.npz'.format(self.path, set_name))
        random.shuffle(input_paths)
        # The endless loop lets Keras keep drawing batches across several epochs.
        while True:
            for input_filepath in input_paths:
                # The batch id links an input file to its matching output file.
                batch_id = input_filepath.split('__')[-1].split('.')[0]
                output_filepath = '{}/{}/batch_output__{}.npz'.format(self.path, set_name, batch_id)
                features = np.load(input_filepath)['arr_0']
                labels = np.load(output_filepath)['arr_0']
                yield features, labels
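
One possible way around this, sketched here under assumptions rather than as a definitive answer, is to skip the compiled metrics entirely. Instead, draw a fixed number of batches from the generator, collect the true and predicted classes side by side, and pass the concatenated arrays to classification_report. The names report_from_generator, toy_generator and toy_predict are hypothetical helpers introduced for illustration, and the sketch assumes the labels are one-hot encoded with shape (batch_size, num_classes), which the question does not state.

```python
import itertools

import numpy as np
from sklearn.metrics import classification_report


def report_from_generator(batches, predict_fn, steps, **report_kwargs):
    """Build a classification report from `steps` batches of a generator.

    Assumption: every batch is a (features, labels) pair, and both the
    labels and the output of `predict_fn` have shape
    (batch_size, num_classes).
    """
    y_true, y_pred = [], []
    # islice stops after `steps` batches, so an endless generator is fine.
    for features, labels in itertools.islice(batches, steps):
        scores = np.asarray(predict_fn(features))
        # Reading labels and predictions from the same batch keeps them
        # aligned even when the generator shuffles its files.
        y_true.append(np.argmax(labels, axis=1))
        y_pred.append(np.argmax(scores, axis=1))
    return classification_report(
        np.concatenate(y_true), np.concatenate(y_pred), **report_kwargs
    )


def toy_generator(num_classes=3, batch_size=4):
    """Hypothetical stand-in for the real data generator."""
    while True:
        labels = np.eye(num_classes)[np.arange(batch_size) % num_classes]
        # The toy features simply copy the labels, so a perfect
        # "model" is trivial to write.
        yield labels.copy(), labels


def toy_predict(features):
    """Hypothetical stand-in for a trained model's batch prediction."""
    return features


print(report_from_generator(toy_generator(), toy_predict, steps=3))
```

With the real pipeline, the call would presumably look like report_from_generator(pipeline.generator('testing'), model.predict_on_batch, steps=num_test_batches), where num_test_batches (a hypothetical name) is the number of batch files in the testing folder. Setting steps to that count matters because the generator never stops on its own.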

Why do you have a `while True` in addition to a `for i in range(num_inputs)`? Isn't the `while` statement a bit redundant? - blacksite, Jun 23 '20 at 14:13
I had the same opinion as you. But I looked at some tutorials, and it seems to be required by `tf.keras.models.Model.fit`. Please see Anakin's post at https://stackoverflow.com/questions/56079223/custom-keras-data-generator-with-yield - kakarotto, Jun 23 '20 at 14:21

0 Answers